Okapi, or “What if ripgrep Could Edit?”

66 points by buffalo7


ClashTheBunny

Interesting circle of history.

Grep was produced in a night based on the source code of ed.

Grep is in fact short for the ed command g/<regular expression>/p, shortened to g/re/p with the slashes dropped.

What is the value of this over ed, which comes with most Unix-like systems?

Okapi to GNU ed Translations

| Okapi Command | GNU ed Command | Explanation |
|---|---|---|
| okapi III | /III/ | Literal string match. Identical in both. |
| okapi "Dan[^l ]\b" | /Dan[^l ]\>/ | Uses \> for the end-of-word boundary constraint. |
| okapi "Mich\wl" -e "Michel" | v/Michel/g/Mich\wl/ | v skips lines containing "Michel", then g searches the remainder. |
| okapi Fli -c ..15 | /^.\{,14\}Fli/ | ^ anchors the search, \{,14\} allows up to 14 characters before Fli. |
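
For anyone who wants to sanity-check the logic of the last two rows without either tool, here is a minimal Python sketch. It only mirrors the behaviour as the table describes it (exclude lines containing a string, then match; allow at most 14 characters before the match). The function names and sample lines are made up for illustration, and it assumes nothing about how okapi or ed implement these internally.

```python
import re

def match_excluding(lines, pattern, exclude):
    """Keep lines that match `pattern` but do not contain the literal `exclude`."""
    pat = re.compile(pattern)
    return [ln for ln in lines if exclude not in ln and pat.search(ln)]

def match_within_columns(lines, literal, max_prefix):
    """Keep lines where `literal` starts after at most `max_prefix` characters."""
    # Equivalent in spirit to the ed pattern /^.\{,14\}Fli/ when max_prefix is 14.
    pat = re.compile(r"^.{,%d}%s" % (max_prefix, re.escape(literal)))
    return [ln for ln in lines if pat.search(ln)]

# Hypothetical sample data, not taken from the okapi documentation.
sample = [
    "Michel went home",
    "Michal stayed",
    "A Flint",
    "xxxxxxxxxxxxxxxxxxxx Flint",
]

if __name__ == "__main__":
    print(match_excluding(sample, r"Mich\wl", "Michel"))
    print(match_within_columns(sample, "Fli", 14))
```

Note that the column version treats the limit as "number of characters before the match", which is how the ed pattern in the table reads; whether okapi's -c range is defined the same way is an assumption here.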

bazzargh

I did a project like this around 2008: https://theknowledgeexchangeblog.com/2014/02/25/a-unique-insight-into-uk-new-towns/ ... The New Towns Record was originally scanned documents that were pushed through some proprietary hypertext system and released on a set of CDs in '96 by our commercial library service. A decade later, I got the job of reformatting it as HTML for online publication. It was about 5GB of text; not much now, but as much as my PC could handle back then. I found that the previous releases had been full of repeated "scannos" and asked to spend a couple of weeks fixing those up too.

What I did was more like: count all unique space-separated words in the text-without-markup, filter out any that were in a dictionary (which I added to over time), then start at the top of the list, replacing the most frequent typos first with Emacs. Replacing similar errors many times was way quicker than proofreading the text, though I did that too. The okapi author says they "needed the precision of regex combined with the power of a text editor"... but that's what dired-do-find-regexp-and-replace does in Emacs.
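
The counting step is easy to sketch in Python. This is a rough reconstruction under assumptions, not the script I actually used: the function name is made up, the text is assumed to have had its markup stripped already, and words are split on whitespace only, so attached punctuation is not handled.

```python
from collections import Counter

def candidate_typos(text, dictionary):
    """Return (word, count) pairs for words not in `dictionary`, most frequent first."""
    # Whitespace splitting only; "word," and "word" count as different tokens.
    counts = Counter(text.split())
    known = {w.lower() for w in dictionary}
    return [(w, n) for w, n in counts.most_common() if w.lower() not in known]

if __name__ == "__main__":
    # Tiny made-up example with two typical OCR slips: "tbe" and "arid".
    text = "tbe cat and tbe dog and tbe bird arid"
    words = {"cat", "and", "dog", "bird"}
    for word, n in candidate_typos(text, words):
        print(n, word)
```

Working down a list like this from the top means each fix removes the largest number of remaining errors, and every genuine word you spot can be added to the dictionary so it stops showing up in later rounds.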

This was also one of my first uses of git, which had only recently been released - it let me quickly checkpoint what I was doing after each round of edits, where svn was just too slow.

I felt at the time there was a gap in the market for a spellchecker that finds likely OCR errors instead of likely typos (but it's very niche).

mxey

Cool. I think there are some GUI editors that let you edit from inside the search results window. Was it BBEdit?