The Computer Oracle

How to find out why text is not searchable in a PDF (and make it searchable)

Full question
https://superuser.com/questions/561589/h...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#pdf #search



ACCEPTED ANSWER

Score 7


  • It may use a custom font encoding that maps character codes to glyphs in a way that is incompatible with established encodings such as ASCII or Unicode (UTF-8).

  • It may draw characters individually and out of reading order.

  • Its characters may have been flattened to paths (vector outlines), which leaves no text to search.
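As a rough way to tell these cases apart, the text copied out of the PDF (or produced by any text extractor) can be inspected. The sketch below is only a heuristic under stated assumptions: the function names and the 0.3 threshold are hypothetical, and it takes the extracted text as a plain string rather than reading the PDF itself. It will not catch a custom encoding that happens to map to ordinary but wrong letters.

```python
import unicodedata


def suspicious_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are unlikely in real text."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 1.0  # nothing extractable at all
    bad = 0
    for c in chars:
        category = unicodedata.category(c)
        # Cc = control, Co = private use, Cn = unassigned; U+FFFD = replacement character
        if category in ("Cc", "Co", "Cn") or c == "\ufffd":
            bad += 1
    return bad / len(chars)


def diagnose(text: str, threshold: float = 0.3) -> str:
    """Classify extracted text; the threshold is an arbitrary assumption."""
    if not text.strip():
        # Consistent with characters flattened to paths, or a scanned image
        return "no text layer"
    if suspicious_ratio(text) >= threshold:
        # Consistent with a custom font encoding
        return "garbled encoding"
    return "looks searchable"
```

For example, `diagnose("")` reports no text layer, while a string made mostly of private-use code points such as `"\ue001\ue002\ue003"` is reported as a garbled encoding.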

See the Stack Overflow questions "How do you debug PDF files?" and the now-deleted "PDF Font encoding - why can't I copy text from a PDF?"

To make the text searchable, the best approach may be to go back to the original source (e.g. a Word document) and produce the PDF by a different process. Alternatively, you could render your current PDF as a bitmap and then run OCR on it, but this will be tedious and produce poor results.
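The render-then-OCR route can be scripted. The sketch below assumes the open-source OCRmyPDF command-line tool is installed; that tool is not mentioned in the answer and is only one possible choice. Its `--force-ocr` option rasterizes every page before recognizing text, and `-l` selects the OCR language. The function names and file names are hypothetical.

```python
import shutil
import subprocess


def build_ocr_command(src: str, dst: str, language: str = "eng") -> list[str]:
    """Build the OCRmyPDF invocation without running it."""
    # --force-ocr rasterizes each page and OCRs it, ignoring any existing text
    return ["ocrmypdf", "--force-ocr", "-l", language, src, dst]


def make_searchable(src: str, dst: str, language: str = "eng") -> bool:
    """Run OCR if the tool is available; return False if it is not installed."""
    if shutil.which("ocrmypdf") is None:
        return False
    subprocess.run(build_ocr_command(src, dst, language), check=True)
    return True
```

Calling `make_searchable("input.pdf", "searchable.pdf")` would write a new file with an OCR text layer. As the answer warns, the recognized text is only as good as the OCR.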




ANSWER 2

Score 2


I found a way around this problem. I chose Tools -> Edit Document Text, then on each page I pressed Ctrl+A (select all), right-clicked, opened Properties, and changed the font to something else. After I did this, the text was searchable and I could copy it!




ANSWER 3

Score 0


This question might be old, but character encoding issues in compound-path PDFs are still a problem today. I solved it as follows:

  • Open the file with the unsearchable text in Illustrator.
  • Choose Save a Copy and save as PDF with the preset Smallest File Size.
  • Open the saved file in Acrobat.
  • Run Scan & OCR > Recognize Text with your settings.
  • Searching with ⌘ + F should now work.

Test source

Environment

  • sw_vers macOS 14.4.1 (23E224) x86_64
  • Adobe Illustrator 24.0.2
  • Adobe Acrobat Pro DC Continuous Release | Version 2021.007.20091