Search String Support

The Problem

Analyzing digital media often involves searching for names, terms, phone numbers, email addresses, or other types of patterns. Current forensic analysis tools make it easy for law enforcement to search evidence for individual key terms and single words. However, when an investigator needs to run a more advanced search, such as for phone numbers or credit card numbers, the task becomes much more difficult. While most forensic tools can search with regular expressions, writing regular expressions that are efficient, effective, and accurate is quite difficult. Running a regular expression search across a piece of evidence can also be quite time consuming. An investigator who writes poor regular expressions and runs them against a case could waste days' worth of time and effort trying to obtain accurate results. Most importantly, an investigator who relies on the results of poorly constructed regular expressions could miss crucial pieces of evidence in a case.
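To illustrate how a poorly constructed expression can miss evidence, the following sketch compares a naive phone number pattern with a broader one. Both patterns, the sample text, and the variable names are assumptions made for this example; they use Python's `re` syntax and are not taken from the project or from any forensic tool.

```python
import re

# Naive pattern (illustrative): matches only the 401-555-0123 style.
NAIVE_PHONE = re.compile(r"\d{3}-\d{3}-\d{4}")

# Broader pattern (illustrative, US-style numbers only): optional
# parentheses around the area code, and space, dot, or hyphen separators.
BROAD_PHONE = re.compile(
    r"(?<!\d)"                        # not preceded by another digit
    r"(?:\(\d{3}\)\s?|\d{3}[\s.-]?)"  # area code, with or without parentheses
    r"\d{3}[\s.-]?"                   # exchange
    r"\d{4}"                          # line number
    r"(?!\d)"                         # not followed by another digit
)

# Hypothetical evidence text: three phone numbers and one long digit run.
SAMPLE = "Call 401-555-0123, (401) 555-0124 or 401.555.0125; ref 12345678901234."

naive_hits = NAIVE_PHONE.findall(SAMPLE)
broad_hits = BROAD_PHONE.findall(SAMPLE)

print(naive_hits)  # ['401-555-0123']
print(broad_hits)  # ['401-555-0123', '(401) 555-0124', '401.555.0125']
```

The naive pattern silently misses two of the three numbers, while the digit-boundary checks in the broader pattern keep it from matching inside the 14-digit reference number.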

Making the problem worse, law enforcement investigators do not share these regular expressions with one another. As a result, each investigator spends a great deal of effort writing regular expressions that someone else has probably already created. For example, someone investigating a credit card fraud case might spend hours writing regular expressions that accurately identify every format in which credit card numbers may be stored on a piece of media. Chances are that similar investigations have already been done, and that the regular expressions for this type of case already exist somewhere.
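As a rough sketch of what such a credit card search might involve, the example below matches 16-digit numbers written with or without spaces or hyphens, then filters the candidates with the Luhn checksum to discard random digit runs. The Luhn filter is an addition made for this illustration and is not described by the project; the pattern covers only 16-digit layouts, and the names `CARD_CANDIDATE`, `luhn_ok`, and `find_cards` are hypothetical.

```python
import re

# Candidate pattern (illustrative): four groups of four digits, optionally
# separated by a space or hyphen, and not embedded in a longer digit run.
CARD_CANDIDATE = re.compile(r"(?<!\d)\d{4}(?:[ -]?\d{4}){3}(?!\d)")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text: str) -> list:
    """Return candidate card numbers in text that also pass the Luhn check."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):
            hits.append(match.group())
    return hits


# 4111 1111 1111 1111 is a widely used test number, not a real account.
sample = "cards: 4111 1111 1111 1111, 4111-1111-1111-1112, 4111111111111111"
print(find_cards(sample))  # ['4111 1111 1111 1111', '4111111111111111']
```

Even this small case needs to handle separators, digit boundaries, and false positives, which is the kind of effort that goes to waste when each investigator starts from scratch.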


The URI research group on Search String Support is working on a web interface tool that addresses both of the aforementioned problems. The tool will allow users to input keywords and data relating to the searches they wish to run, and its generator will output regular expressions for those searches. The generator will be able to produce search expressions that are compatible with EnCase Forensics, Forensic Toolkit (FTK), X-Ways Forensics, and I-Look Investigator. The web interface will also allow users to store regular expressions that they have generated and to search regular expressions that others have previously generated. This will greatly reduce the amount of time that investigators spend writing accurate regular expressions.
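The idea of turning keywords into a regular expression and storing the result for reuse can be sketched as follows. This is a minimal illustration under stated assumptions, not the group's implementation: it emits Python `re` syntax rather than the dialect of any particular forensic tool, and `keywords_to_regex`, `PatternLibrary`, and the tag names are hypothetical.

```python
import re


def keywords_to_regex(keywords, whole_word=True):
    """Build one alternation pattern from a list of literal keywords."""
    # Drop blanks and duplicates; put longer terms first so that
    # "wire transfer" is tried before its prefix "wire".
    cleaned = sorted({k.strip() for k in keywords if k.strip()},
                     key=lambda k: (-len(k), k))
    if not cleaned:
        raise ValueError("at least one keyword is required")
    body = "|".join(re.escape(k) for k in cleaned)
    if whole_word:
        # Lookarounds instead of \b so keywords ending in punctuation
        # (such as "acct.") still match when followed by a space.
        return r"(?<!\w)(?:" + body + r")(?!\w)"
    return "(?:" + body + ")"


class PatternLibrary:
    """Tiny in-memory stand-in for a shared store of saved expressions."""

    def __init__(self):
        self._entries = []

    def store(self, name, pattern, tags):
        self._entries.append({"name": name, "pattern": pattern,
                              "tags": {t.lower() for t in tags}})

    def search(self, tag):
        return [e["name"] for e in self._entries if tag.lower() in e["tags"]]


pattern = keywords_to_regex(["wire transfer", "acct.", "wire"])
text = "Sent a Wire Transfer from acct. 99; rewire later."
hits = re.findall(pattern, text, flags=re.IGNORECASE)
print(hits)  # ['Wire Transfer', 'acct.']

library = PatternLibrary()
library.store("fraud-terms", pattern, tags=["fraud", "banking"])
print(library.search("Fraud"))  # ['fraud-terms']
```

Escaping each keyword keeps characters such as `.` from being interpreted as regular expression operators, which is one of the easier mistakes to make when writing these patterns by hand.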


The Search String Support Research Group has integrated its research into an online interactive tool released on the Electronic Crime Technology Center of Excellence's website. The final product, known as the Law Enforcement Search String Assistant, stores commonly used keyword lists for investigators and provides an interactive web interface to help create regular expressions.


This research is supported by the National Institute of Justice.