Improving PixelMelt's Kindle Web Deobfuscator
12 points by mtlynch
So someone took a pretty accurate method, reintroduced what the original author said was yielding errors, thereby introducing errors again, and is claiming it’s an improvement?
I liked the original article a lot, but this one is not bringing any insights.
So someone took a pretty accurate method, reintroduced what the original author said was yielding errors, thereby introducing errors again, and is claiming it’s an improvement?
I'm confused by this criticism.
This is exactly how science is supposed to work.
What do you feel that Terence Eden did wrong?
Thank you for your constructive criticism on my blog post - I appreciate it.
The original author said:
OCR probably need words and sentences to work well.
Which is what I did. It does produce better results for the majority of the text.
The original frequently confused . with • and , with ' - my method doesn't. Of course, OCR will always have some edge cases.
Nevertheless, I'd be grateful for your insight and expertise into what I could do to improve the OCR process.
Sorry, I was annoyed at something else, completely unrelated to this article, and let that spill into that comment. I apologize for the unnecessarily negative criticism.
The part that I missed about the original article is that it did character-level OCR, as opposed to the whole-page OCR that you did.