# Running OCR on PDFs and Images Directly in the Browser

> Clean Markdown view of GeekNews topic #14128. Use the original source for factual precision when an external source URL is present.

## Metadata

- GeekNews HTML: [https://news.hada.io/topic?id=14128](https://news.hada.io/topic?id=14128)
- GeekNews Markdown: [https://news.hada.io/topic/14128.md](https://news.hada.io/topic/14128.md)
- Type: news
- Author: [xguru](https://news.hada.io/@xguru)
- Published: 2024-04-03T10:51:01+09:00
- Updated: 2024-04-03T10:51:01+09:00
- Original source: [simonwillison.net](https://simonwillison.net/2024/Mar/30/ocr-pdfs-images/)
- Points: 22
- Comments: 0

## Topic Body

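- Uses Tesseract.js to read the contents of images and PDF files directly
- Runs entirely in the browser with no server, so no data ever leaves the user's machine
- The code was written with the help of Claude 3 Opus and GPT-4; the base code and the prompts are published as well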
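
The flow described above (recognize each image locally, then collect the text) could be sketched roughly as follows. This is only an illustration, not the original author's code: `OcrEngine` and `ocrPages` are hypothetical names, and the engine is passed in as a parameter so that the sketch does not depend on any particular library version. It also assumes that PDF pages have already been rendered to images, a step that is left out here.

```typescript
// Minimal shape of an OCR engine (hypothetical interface).
// A Tesseract.js worker exposes recognize(image), which resolves to an
// object carrying the recognized text in data.text; only that part is
// modeled here.
interface OcrEngine<Image> {
  recognize(image: Image): Promise<{ data: { text: string } }>;
}

// Hypothetical helper: OCR each page image in order and join the text.
async function ocrPages<Image>(
  pages: Image[],
  engine: OcrEngine<Image>,
): Promise<string> {
  const texts: string[] = [];
  for (const page of pages) {
    // Pages are processed one at a time so the output order matches
    // the input order.
    const result = await engine.recognize(page);
    texts.push(result.data.text.trim());
  }
  // Separate pages with a blank line.
  return texts.join("\n\n");
}
```

In a browser, with a recent Tesseract.js release, one would presumably obtain the engine with `const worker = await Tesseract.createWorker("eng")`, pass `worker` as the second argument, and call `await worker.terminate()` afterwards. Because the recognition runs inside the page, the images are not uploaded to a server.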

## Comments


_No public comments on this page._