grep better than IDE/LSP

What is this about

I do not mean grep specifically but plain text search in general. Whether you run it as a command in a terminal or with CTRL+F inside your text editor or IDE does not matter. My premise is that plain text search is better.

I would like to argue for it against IDE mechanisms such as find all references, go to definition and similar features, which can also be found in text editors powered by LSP (Language Server Protocol).

My experience

For more than 7 years I used an IDE exclusively before I even attempted writing code without one. I was completely dependent on it, to the point that I was not able to work without its help. That became a problem whenever a project would not load, I was in an environment without an IDE, or there was some other issue with it.

Those situations initially forced me to work with code as just plain text - both when editing and when searching. After a lot of time it has become my preferred way of working with code. Why?

Specifics

Big output = big context

Whenever I bring up this topic, the first argument against plain text search is always: but you will get more output than the actual usages of the searched variable/function. Yes! That's good.

Thanks to that we see not only the usages of the searched variable/function but also every mention in comments, documentation, readme files and notes, plus all lines that are commented out or disabled by preprocessor directives. We also see the context in which the word is used beyond the variable usages themselves. That helps us understand the name better and maybe discover that there are multiple different things named similarly. We might also notice a different part of the code which already does what we are trying to do. In summary: we see more, we learn more and we make better decisions.

Most people are simply overwhelmed by the amount of results from plain text search because there are a lot of lines in the output. It's just a matter of getting used to it. Typically you will get around 50 to 100 lines. That's not that many. Additionally, each output line is preceded by the path to the file, which not only gives you useful extra context but also lets you filter the output and read only the relevant parts. You might also be surprised to discover new, weird locations or see something you were not expecting.

There is a downside to this method: short names. One- or two-letter names are typically not unique enough to give back meaningful output, although you might be surprised that three-letter names are often quite unique already. Searching for overused names like name or value is also problematic. In those cases IDE/LSP wins - no question about that. But that might also be a sign that the code needs refactoring.
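To make the filtering idea concrete, here is a minimal sketch. The directory layout, file names and the searched word user_id are all made up for illustration; the only assumption about tooling is a grep that supports -r and -n.

```shell
# Hypothetical sandbox: three files that all mention "user_id".
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/docs" "$demo/tests"
echo 'int user_id = 0;'           > "$demo/src/main.c"
echo 'The user_id column is key.' > "$demo/docs/notes.txt"
echo 'assert(user_id == 0);'      > "$demo/tests/test_main.c"

# Every hit is prefixed with "path:line:", so the output is plain text as well.
all_hits=$(cd "$demo" && grep -rn 'user_id' . | sort)
echo "$all_hits"

# Filter the same output by path: drop everything under tests/ ...
no_tests=$(echo "$all_hits" | grep -v '^\./tests/')
# ... or keep only the documentation.
only_docs=$(echo "$all_hits" | grep '^\./docs/')
echo "$only_docs"
```

Because the search result is itself just text, the second and third commands are nothing more than another plain text search run on the first one.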

IDE/LSP does not show you everything

My biggest issue with find all references is that it does not show all places in which the searched symbol is used. It can't look inside comments, but it also skips code disabled by preprocessor directives (#if) and files which are not part of the project. This is problematic when you have different build configurations, such as DEBUG and RELEASE, or generated code.

An IDE also won't help you when you are searching for the same thing in several languages at once. Imagine searching for a variable which is used both in your backend code and in an HTML template, or in an SQL script and in your Java/C# files. Refactoring tools like rename all usages give you a false sense of security in those situations. Relying on them is also the reason why comments mentioning variables get outdated.

Regarding commented-out code: I don't like the approach of leaving commented-out code in the repository, but if it's there I want to know. It is also a good source of knowledge, and text-based search allows you to find those places.
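As a small illustration, here is a sketch with a made-up C file (session.c and the name session_timeout are hypothetical). Plain grep does not know which branch of the preprocessor condition is active, so it reports both branches and the comment as well.

```shell
# Hypothetical file with a comment and two build-dependent definitions.
demo_pp=$(mktemp -d)
cat > "$demo_pp/session.c" <<'EOF'
#include "session.h"
// TODO: session_timeout should come from the config file
#ifdef DEBUG
int session_timeout = 5;
#else
int session_timeout = 1800;
#endif
EOF

# -n prints the line number in front of every match.
pp_hits=$(grep -n 'session_timeout' "$demo_pp/session.c")
echo "$pp_hits"
```

The output lists lines 2, 4 and 6: the comment, the DEBUG definition and the definition used otherwise.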
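A sketch of such a cross-language search, assuming a made-up project in which a column called display_name appears in an SQL migration, a Java class and an HTML template (all file names and contents are hypothetical):

```shell
# Hypothetical project: the same name lives in three different languages.
demo_ml=$(mktemp -d)
mkdir -p "$demo_ml/db" "$demo_ml/src" "$demo_ml/templates"
echo 'ALTER TABLE users ADD COLUMN display_name TEXT;'      > "$demo_ml/db/001_users.sql"
echo 'String display_name = row.getString("display_name");' > "$demo_ml/src/UserDao.java"
echo '<span>{{ user.display_name }}</span>'                 > "$demo_ml/templates/profile.html"
echo 'body { margin: 0; }'                                  > "$demo_ml/templates/site.css"

# -r searches recursively, -l prints only the names of files with a match.
ml_files=$(cd "$demo_ml" && grep -rl 'display_name' . | sort)
echo "$ml_files"
```

One command lists the SQL script, the Java file and the template together, which is exactly the list you would want to walk through before renaming the column.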

Fine, but go to definition is clearly better

If I don't know where something is defined then using go to definition is faster.

That's true but if you don't know where something is defined that means that the codebase is not well organized.

I know that real projects are messy and we do not always have a say in all matters. The project might be old or created by inexperienced developers. But in all those scenarios plain text search will let you learn more about such a codebase. The easy-to-use go to definition hides the mess from you, so you neither learn about the project nor improve it.

Regular expressions /= new User/

If you practice plain text searching, sooner or later you will start using regular expressions. Not only is that an additional skill which you can use throughout your programming career, it also lets you do more precise searches which cannot be performed with find all references.

Examples:
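A few patterns, sketched against a made-up file. Users.cs and its contents are hypothetical, and grep -E (extended regular expressions) is assumed; other tools accept similar patterns with slightly different syntax.

```shell
# Hypothetical file to search in.
demo_re=$(mktemp -d)
cat > "$demo_re/Users.cs" <<'EOF'
var a = new User("ann");
var b =   new  User("bob");
var c = new UserGroup();
if (count == 0) return;
count = 1;
// newUser is created elsewhere
EOF

# 1. Object creation, tolerant of irregular spacing, but not UserGroup.
re_new=$(grep -nE '= *new +User\(' "$demo_re/Users.cs")
echo "$re_new"

# 2. Assignments to count, but not comparisons: "=" must not be followed by "=".
re_assign=$(grep -nE 'count *=[^=]' "$demo_re/Users.cs")
echo "$re_assign"

# 3. Either of two names in a single pass.
re_either=$(grep -nE 'UserGroup|newUser' "$demo_re/Users.cs")
echo "$re_either"
```

The first search reports lines 1 and 2, the second only line 5, and the third lines 3 and 6.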

Versatility

Plain text search is a much more versatile skill than IDE functionalities. It can be applied not only to code but also to logs, text files with notes and documentation, and data files (like .csv). It also allows you to work with languages and technologies which your IDE does not support or which don't have such tools at all. It might also happen that you are not allowed to install any tools and have to rely on what is already on the machine (like on servers or customer computers).

The same goes for using someone else's machine. Your friend, parents, girlfriend - they probably don't have any IDE, and you might want or have to do a little coding (some small script or something) on their computer. All you will have there is a bare-bones text editor and a console. When you are accustomed to working with just plain text and plain text search, you are much more capable in any situation ;)
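The same commands work unchanged on files that are not code. A minimal sketch, with an invented log file and an invented CSV file (names, dates and values are made up):

```shell
# Hypothetical log and data files.
demo_data=$(mktemp -d)
cat > "$demo_data/app.log" <<'EOF'
2024-01-05 10:00:01 INFO  started
2024-01-05 10:00:07 ERROR db timeout
2024-01-05 10:01:12 WARN  slow query
2024-01-05 10:02:30 ERROR db timeout
EOF
cat > "$demo_data/orders.csv" <<'EOF'
id,customer,status
1,ann,paid
2,bob,refunded
3,ann,refunded
EOF

# -c counts matching lines instead of printing them.
log_errors=$(grep -c 'ERROR' "$demo_data/app.log")
echo "$log_errors"

# Rows whose last column is "refunded"; $ anchors the match to the line end.
csv_refunds=$(grep -n ',refunded$' "$demo_data/orders.csv")
echo "$csv_refunds"
```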

How to start

In order to do what I'm suggesting, plain text search must be convenient. Most text editors and IDEs which have a search window opened via CTRL+SHIFT+F are not that convenient to use. You have to reach for your mouse, and neither running subsequent searches nor repeating previous ones is easy. You also lose the output of the previous search when executing a new one.

It can be done efficiently in an editor, but I would suggest getting familiar with your terminal. On Windows use the PowerShell console with the Select-String command. On Linux you have a variety of grep commands to choose from. If you are using git then git grep is a great option as it is blazingly fast.

But that's where the specific advice ends, because the rest depends on your environment, experience and the type of work you do. You have to learn how the terminal/console itself works. You can create helper scripts or aliases to make the commands faster and easier to use.

Here are some examples:

Windows

A handy PowerShell function:

  function slsr
  {
    [CmdletBinding()]
    Param($filter, $pattern)
    Get-ChildItem -Recurse -File -Filter $filter | Select-String $pattern
  }

This one allows you to recursively search for the given pattern in all files whose names match the given filter. For example:

  PS> slsr *.cs ' = new User'
  PS> slsr *.js 'function foo'
  PS> slsr *.html 'input .* name="password"'

And here is a function for git grep which runs it without pagination, for faster execution, and in a case-insensitive manner:

  function Git-Grep ($filePattern, $pattern) {
      if ($filePattern.Trim() -eq "*") {
          git --no-pager grep -i $pattern
      } else {
          git --no-pager grep -i $pattern -- "**/$filePattern"
      }
  }

Linux

Here is an alias that I keep in my .bashrc:

# Without that bash will replace patterns like *.c with actual files list
# so I would have to close those inside "" every time, to avoid that we
# disable the expansion before executing the function and then enable it back.
function reset_expansion(){ CMD="$1";shift;$CMD "$@";set +f;}
alias sr='set -f; reset_expansion sr'; function sr() { # usage: sr *.c regex
    # extend search to header files or source files if a pattern to *.c/*.h is given
    local stop_recursion=${3:-false}
    if [ "$stop_recursion" = false ]; then
        # "$2" is quoted so that a regex containing spaces stays a single argument
        if [ "$1" = "*.c" ] || [ "$1" = "*.cpp" ]; then sr '*.h' "$2" true; sr '*.hpp' "$2" true; fi
        if [ "$1" = "*.h" ] || [ "$1" = "*.hpp" ]; then sr '*.c' "$2" true; sr '*.cpp' "$2" true; fi
    fi
    local inside_git_repo="$(git rev-parse --is-inside-work-tree 2> /dev/null)"
    if [ "$inside_git_repo" ]; then
        echo "git --no-pager grep --line-number -i \"$2\" -- \"$1\"" >&2
        git --no-pager grep --line-number -i "$2" -- "$1"
        git --no-pager grep --line-number -i "$2" -- "**/$1"
    else
        echo "grep -nirI \"$2\" --include=\"$1\"" >&2
        grep -nirI "$2" --include="$1"
    fi
}

Usages:

  bash> sr *.c 'struct User'
  bash> sr *.js 'function foo'
  bash> sr *.html 'input .* name="password"'

It's quite a complex one 'cause it has both grep and git grep in it and chooses which one to use depending on whether I'm in a git repository or not. I also extended it so that it searches in .h files when a .c file pattern is provided, and the other way around, so I don't have to execute one search for header files and another for implementation files. Handy for C and C++ but not much use for anything else.

End

The most important thing is simply getting familiar with plain text search as such. The exact tool is not important. And the terminal is just the beginning. You will want to be able to jump quickly to the file locations from those search results, so some integration with your text editor will be necessary. But all of that is for you to figure out.
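As one possible starting point, here is a minimal sketch of such an integration. The helper name hit_to_editor_args is made up, and it assumes search results in the usual path:line:text form, paths without colons or spaces, and an editor that accepts +LINE before the file name (vi, vim, nano and emacs do).

```shell
# Turn one "path:line:text" search result into "+line path" editor arguments.
# Hypothetical helper; assumes the path itself contains no colon.
hit_to_editor_args() {
    hit="$1"
    file="${hit%%:*}"     # everything before the first colon
    rest="${hit#*:}"      # everything after the first colon
    line="${rest%%:*}"    # everything before the next colon
    printf '+%s %s\n' "$line" "$file"
}

hit_to_editor_args 'src/main.c:42:int user_id = 0;'

# Usage sketch (not executed here):
#   ${EDITOR:-vi} $(hit_to_editor_args "$hit")
```

The call above prints +42 src/main.c, which an editor can take directly as its arguments.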

Good luck with your searches 🔍