# Data Engineering Handbook

> Clean Markdown view of GeekNews topic #17963. Use the original source for factual precision when an external source URL is present.

## Metadata

- GeekNews HTML: [https://news.hada.io/topic?id=17963](https://news.hada.io/topic?id=17963)
- GeekNews Markdown: [https://news.hada.io/topic/17963.md](https://news.hada.io/topic/17963.md)
- Type: news
- Author: [xguru](https://news.hada.io/@xguru)
- Published: 2024-11-27T09:46:01+09:00
- Updated: 2024-11-27T09:46:01+09:00
- Original source: [github.com/DataExpert-io](https://github.com/DataExpert-io/data-engineer-handbook)
- Points: 46
- Comments: 0

## Summary

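Covers projects, interview material, books, communities, and newsletters for becoming a data engineer. It also lists data engineering tech companies and blogs, white papers, social accounts, podcasts, newsletters, and training courses. In particular, it recommends three must-read books for data engineering, all of which already have published Korean translations: "Fundamentals of Data Engineering" (견고한 데이터 엔지니어링), "Designing Data-Intensive Applications" (데이터 중심 애플리케이션 설계), and "Designing Machine Learning Systems" (머신러닝 시스템 설계).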

## Topic Body

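- A repo that collects all the resources for becoming a data engineer
  - A collection of projects, interviews, books, communities, and newsletters
- If you are new, start by reading the [2024 roadmap for breaking into data engineering](https://news.hada.io/topic?id=17939)
- 3 must-read books and 25 other notable books
  - Fundamentals of Data Engineering
  - Designing Data-Intensive Applications
  - Designing Machine Learning Systems
- 5 must-join communities and about 10 other notable communities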
  - [DE] [DataExpert.io Community Discord](https://discord.gg/JGumAXncAK)  
  - [DE] [Data Talks Club Slack](https://datatalks.club/slack)  
  - [DE] [Data Engineer Things Community](https://www.dataengineerthings.org/aboutus/)  
  - [ML] [AdalFlow Discord](https://discord.com/invite/ezzszrRZvT)  
  - [ML] [Chip Huyen MLOps Discord](https://discord.gg/dzh728c5t3)  
- Data engineering tech companies and blogs
  - Companies organized by category: Orchestration, Data Lake/Cloud, Warehouse, Data Quality, Education, Analytics/Visualization, Data Integration, Modern OLAP, LLM applications, Real-time data
  - Blogs: [Netflix](https://netflixtechblog.com/tagged/big-data), [Uber](https://www.uber.com/blog/houston/data/?uclick_id=b2f43229-f3f4-4bae-bd5d-10a05db2f70c), [Databricks](https://www.databricks.com/blog/category/engineering/data-engineering), [Airbnb](https://medium.com/airbnb-engineering/data/home), [Amazon AWS Blog](https://aws.amazon.com/blogs/big-data/), [Microsoft Data Architecture Blogs](https://techcommunity.microsoft.com/t5/data-architecture-blog/bg-p/DataArchitectureBlog), [Microsoft Fabric Blog](https://blog.fabric.microsoft.com/), [Oracle](https://blogs.oracle.com/datawarehousing/), [Meta](https://engineering.fb.com/category/data-infrastructure/), [Onehouse](https://www.onehouse.ai/blog)
- Data engineering white papers
  - [A Five-Layered Business Intelligence Architecture](https://ibimapublishing.com/articles/CIBIMA/2011/695619/695619.pdf)  
  - [Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics](https://www.cidrdb.org/cidr2021/papers/cidr2021_paper17.pdf)
  - [Big Data Quality: A Data Quality Profiling Model](https://link.springer.com/chapter/10.1007/978-3-030-23381-5_5)  
  - [The Data Lakehouse: Data Warehousing and More](https://arxiv.org/abs/2310.08697)  
  - [Spark: Cluster Computing with Working Sets](https://dl.acm.org/doi/10.5555/1863103.1863113)  
  - [The Google File System](https://research.google/pubs/the-google-file-system/)  
  - [Building a Universal Data Lakehouse](https://www.onehouse.ai/whitepaper/onehouse-universal-data-lakehouse-whitepaper)  
  - [XTable in Action: Seamless Interoperability in Data Lakes](https://arxiv.org/abs/2401.09621)  
  - [MapReduce: Simplified Data Processing on Large Clusters](https://research.google/pubs/mapreduce-simplified-data-processing-on-large-clusters/)  
- Notable social accounts and podcasts
- 4 must-subscribe newsletters and more than 20 others
  - [DataEngineer.io Newsletter](https://blog.dataengineer.io)  
  - [Joe Reis](https://joereis.substack.com)  
  - [Start Data Engineering](https://www.startdataengineering.com)  
  - [Data Engineering Weekly](https://www.dataengineeringweekly.com)  
- Various training courses

## Comments



_No public comments on this page._
