grep better than IDE/LSP
What is this about
I do not mean grep specifically, but plain text search in general. Whether it is executed in a terminal via a command or inside your text editor/IDE with CTRL+F does not matter. My premise is that plain text search is better.
I would like to argue for it against IDE mechanisms like "find all references", "go to definition" and similar functionality, which can also be found in LSP (Language Server Protocol) powered text editors.
My experience
For more than 7 years I used an IDE exclusively before I even attempted writing code without one. I was completely dependent on it, to the point that I was not able to work without its help. That became a problem whenever a project would not load, I was in an environment without an IDE, or there was any other issue with it.
Those were the situations which initially forced me to work with code using just plain text - both in editing and searching. After a long time, that has become my preferred way of working with code. Why?
Specifics
Big output = big context
Whenever I bring up this topic, the first argument against plain text search is always: "but you will get more output than the actual usages of the searched variable/function". Yes! That's good.
Thanks to that we can see not only the usages of the searched variable/function but also all mentions in comments, documentation, readme files and notes, as well as all lines that are commented out or disabled by preprocessor directives. We can also see the context in which a given word is used - beyond just the variable usages. That way we can better understand the name and maybe discover that there are multiple different things named similarly. We may also notice a different part of the code which already does what we are trying to do. In summary: we can see more, learn more and make better decisions.
Most people are simply overwhelmed by the amount of results from plain text search because there are a lot of lines in the output. It's just a matter of getting used to it. In my experience you will typically get around 50 to 100 lines. That's not that many. Additionally, each output line is preceded by the path to the file, which not only gives you additional, useful context but also allows you to filter the output and read only the relevant parts - you might also be surprised to discover new, weird locations or see something you were not expecting.
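Here is a minimal sketch of that path-based filtering. The project layout (src, docs and tests directories) and the User name are made up purely for the example; in a real codebase you would filter on whatever paths turn out to be noise for you.

```shell
# Hypothetical sample tree, created only so the example is self-contained.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/docs" "$demo/tests"
printf 'class User {}\n' > "$demo/src/user.cs"
printf 'User u = FindUser(name);\n' > "$demo/src/login.cs"
printf 'The User object is described here.\n' > "$demo/docs/readme.md"
printf 'var u = new User();\n' > "$demo/tests/user_test.cs"

# Search everything: each hit is prefixed with its path and line number.
all_hits=$(cd "$demo" && grep -rn 'User' . | sort)
echo "$all_hits"

# Filter the same output by path: drop the tests, keep everything else.
filtered=$(echo "$all_hits" | grep -v '^\./tests/')
echo "$filtered"

rm -rf "$demo"
```

The second grep works on the output of the first one, so nothing has to be searched again. The path prefix is what makes this kind of filtering possible.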
There is a downside to this method - short names. One- and two-letter names are typically not unique enough to give back meaningful output. But you would be surprised how unique three-letter names already are. Searching for overused names like name or value is also problematic. In those cases IDE/LSP wins - no question about that. But that might also be a sign that a refactoring is needed.
IDE/LSP does not show you everything
My biggest issue with "find all references" is that it does not show all places in which the searched symbol is used. It can't do that in comments, but it also skips code disabled by preprocessor directives such as #IF, and files which are not part of the project. This is problematic when you have different build configurations (DEBUG and RELEASE) or generated code.
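To illustrate with a small sketch: the file below is invented for the example (Service.cs, currentUser, LogUser and Save are all hypothetical names). A plain text search does not care whether a line is compiled in the current configuration or is just a comment, so it reports all three mentions.

```shell
# Hypothetical C#-like file with a DEBUG-only line and a comment.
demo=$(mktemp -d)
cat > "$demo/Service.cs" <<'EOF'
#if DEBUG
    LogUser(currentUser);
#endif
// currentUser must be set before calling Save()
Save(currentUser);
EOF

# Plain text search: the preprocessor block and the comment are just text.
hits=$(grep -n 'currentUser' "$demo/Service.cs")
echo "$hits"

rm -rf "$demo"
```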
Also, an IDE won't help you when you are searching for the same thing in different languages at once. Imagine searching for a variable which is used both in your backend code and in an html template file, or in an SQL script and your Java/C# files.
Refactoring tools like "rename all usages" give you a false sense of security in those situations - relying on them is also the reason why comments mentioning variables get outdated.
As for commented out code: I don't like the approach of leaving it in the repository, but if it's there I want to know. It is also a good source of knowledge, and text based search allows you to find those places.
Fine, but go to definition is clearly better
If I don't know where something is defined then using "go to definition" is faster.
That's true, but if you don't know where something is defined, that means the codebase is not well organized.
I know that real projects are messy and we do not always have a say in all matters. The project might be old or created by inexperienced developers. But in all those scenarios plain text search will allow you to learn more about such a codebase. The easy to use "go to definition" will hide this mess from you, keeping you from learning about the project and improving it.
Regular expressions /= new User/
When practicing plain text search, sooner or later you will start using regular expressions. Not only is it an additional skill which can be used throughout your programming career, it also allows you to do more precise searches which cannot be performed with "find all references".
Examples:
- Looking not just for the variable but for the places in which it is assigned.
- Searching for function calls with specific values.
- Searching for a couple of words at the same time, e.g.: /user\.(name|age) =/
- Searching for a sequence of words, e.g.: /var .* = get.*FromDb\(/
- Searching for variations of a given word, e.g.: /user|users|user_array/
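To see two of those patterns in action, here is a small sketch using grep with extended regular expressions (-E). The sample file and its contents are invented for the example; only the patterns come from the list above.

```shell
# Hypothetical sample file to search in.
demo=$(mktemp -d)
cat > "$demo/sample.js" <<'EOF'
user.name = "Ann";
print(user.name);
user.age = 30;
var total = getTotalFromDb(id);
var cached = getTotalFromCache(id);
EOF

# Only assignments to user.name or user.age, not plain reads.
assignments=$(grep -nE 'user\.(name|age) =' "$demo/sample.js")
echo "$assignments"

# A sequence of words: a var initialised from a get...FromDb( call.
db_reads=$(grep -nE 'var .* = get.*FromDb\(' "$demo/sample.js")
echo "$db_reads"

rm -rf "$demo"
```

The first search skips the print(user.name) line because nothing is assigned there, and the second one skips the FromCache call. That kind of precision is hard to get from a symbol-based lookup.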
Versatility
Familiarity with plain text search is much more versatile than IDE functionalities. This skill can be applied not only to code but also to logs, text files with notes and documentation, and data files (like .csv). It also allows you to work with languages and technologies your IDE does not support or that don't have such tools. It also might happen that you are not allowed to install any tools and have to rely on what is already on the machine (like on servers or customer computers).
The same goes when using someone else's machine. Your friend, parents, girlfriend - they probably don't have any IDE and you might want or have to do a little coding (some small script or something) on their computer. All you will have there is a bare-bones text editor and a console. When you are accustomed to working with just plain text and plain text search, you are much more capable in any situation ;)
How to start
For what I'm suggesting to work, plain text search must be convenient. Most text editors and IDEs which have a search window opened via CTRL+SHIFT+F are not that convenient to use. You have to reach for your mouse to use it, and subsequent searches are not that easy to execute, nor is repeating previous searches. You also lose the output from the previous search when executing a new one.
It can be done efficiently in an editor, but I would suggest getting familiar with your terminal. On Windows use the PowerShell console with the Select-String command. On Linux you have a variety of grep commands to choose from. If you are using git, then git grep is a great option as it is blazingly fast.
But that's where the specific advice ends, because the rest depends on your environment, experience and the type of work you do. You have to learn how the terminal/console itself works. You can create helper scripts or aliases to make the commands faster and easier to use.
Here are some examples:
Windows
A handy PowerShell function:
    function slsr {
        [CmdletBinding()]
        Param($filter, $pattern)
        Get-ChildItem -Recurse -File -Filter $filter | Select-String $pattern
    }
This one allows you to recursively search all files matching the given filter for the given pattern. For example:
    PS> slsr *.cs ' = new User'
    PS> slsr *.js 'function foo'
    PS> slsr *.html 'input .* name="password"'
And here is a function for git grep to allow for faster execution, without pagination and in a case insensitive manner:
    function Git-Grep ($filePattern, $pattern) {
        if ($filePattern.Trim() -eq "*") {
            git --no-pager grep -i $pattern
        } else {
            git --no-pager grep -i $pattern -- "**/$filePattern"
        }
    }
Linux
Here is an alias that I keep in my .bashrc:
    # Without this, bash will replace patterns like *.c with the actual file list,
    # so I would have to enclose those in "" every time. To avoid that we
    # disable the expansion before executing the function and then enable it back.
    function reset_expansion(){ CMD="$1";shift;$CMD "$@";set +f;}
    alias sr='set -f; reset_expansion sr';
    function sr() {
        # usage: sr *.c regex
        # extend search to header files or source files if a pattern to *.c/*.h is given
        local stop_recursion=${3:-false}
        if [ "$stop_recursion" = false ]; then
            # "$2" is quoted so that patterns containing spaces stay one argument
            if [ "$1" = "*.c" ] || [ "$1" = "*.cpp" ]; then
                sr '*.h' "$2" true; sr '*.hpp' "$2" true;
            fi
            if [ "$1" = "*.h" ] || [ "$1" = "*.hpp" ]; then
                sr '*.c' "$2" true; sr '*.cpp' "$2" true;
            fi
        fi
        local inside_git_repo="$(git rev-parse --is-inside-work-tree 2> /dev/null)"
        if [ "$inside_git_repo" ]; then
            echo "git --no-pager grep --line-number -i \"$2\" -- \"$1\"" >&2
            git --no-pager grep --line-number -i "$2" -- "$1"
            git --no-pager grep --line-number -i "$2" -- "**/$1"
        else
            echo "grep -nirI \"$2\" --include=\"$1\"" >&2
            grep -nirI "$2" --include="$1"
        fi
    }
Usage:
    bash> sr *.c 'struct User'
    bash> sr *.js 'function foo'
    bash> sr *.html 'input .* name="password"'
It's quite a complex one 'cause it has both grep and git grep in it and chooses which one to use depending on whether I'm in a git repository or not. I also extended it so that it will search in .h files when a .c file pattern is provided, and the other way around, so I don't have to execute one search for header files and another for implementation files. Handy for C and C++ but not much use for anything else.
End
The most important thing is simply getting familiar with plain text search as such. The exact tool is not important. And the terminal is just the beginning. You will want to be able to jump to the file locations from those search results quickly, so some integration with your text editor will be necessary. But all of that is for you to figure out.
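As one possible starting point for that integration, here is a small sketch that turns a single search hit into an editor command. The hit_to_command helper is hypothetical, not part of any tool. It assumes the hit has the path:line:text shape produced by grep -n, that the path contains no colon, and that your editor accepts a +LINE argument (vi, vim and nano do).

```shell
# Hypothetical helper: turn one "path:line:text" search hit into an editor call.
hit_to_command() {
    local hit="$1"
    local file="${hit%%:*}"      # everything before the first colon
    local rest="${hit#*:}"       # everything after the first colon
    local line="${rest%%:*}"     # the line number
    echo "${EDITOR:-vi} +$line $file"
}

# Build the command for one (made-up) hit, using vim as the editor.
cmd=$(EDITOR=vim hit_to_command 'src/user.c:42:struct User u;')
echo "$cmd"
```

It only prints the command instead of running it, so you can check what it would do before wiring it into your own aliases.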
Good luck with your searches 🔍