Channel: Baeldung

Pattern Search with Grep in Java


1. Overview

In this tutorial, we’ll learn how to search for a pattern in one or more files using Java and third-party libraries such as Unix4J and Grep4J.

2. Background

Unix has a powerful command called grep, which stands for “global regular expression print”. It searches for a pattern or regular expression within a given set of files.

The grep command accepts zero or more options that refine the search result. We’ll look at some of them in the following sections.

If you’re using Windows, you can install bash as mentioned in the post here.

3. With Unix4J library

First, let’s see how to use the Unix4J library to grep a pattern in a file.

In the following examples, we’ll look at how to translate Unix grep commands into Java.

3.1. Build Configuration

Add the following dependency to your pom.xml (or the equivalent declaration to your build.gradle):

<dependency>
    <groupId>org.unix4j</groupId>
    <artifactId>unix4j-command</artifactId>
    <version>0.4</version>
</dependency>

3.2. Example with Grep

Sample grep in Unix:

grep "NINETEEN" dictionary.txt

The equivalent in Java is:

@Test 
public void whenGrepWithSimpleString_thenCorrect() {
    int expectedLineCount = 4;
    File file = new File("dictionary.txt");
    List<Line> lines = Unix4j.grep("NINETEEN", file).toLineList(); 
    
    assertEquals(expectedLineCount, lines.size());
}

Another example is an inverse search, which returns the lines that do not contain the given text. Here’s the Unix version:

grep -v "NINETEEN" dictionary.txt

Here’s the Java version of the above command:

@Test
public void whenInverseGrepWithSimpleString_thenCorrect() {
    int expectedLineCount = 178687;
    File file = new File("dictionary.txt");
    List<Line> lines 
      = Unix4j.grep(Grep.Options.v, "NINETEEN", file).toLineList();
    
    assertEquals(expectedLineCount, lines.size()); 
}

Let’s see how we can use a regular expression to search for a pattern in a file. Here’s the Unix version, which counts the lines in the whole file that match the regular expression:

grep -c ".*?NINE.*?" dictionary.txt

Here’s the Java version of the above command:

@Test
public void whenGrepWithRegex_thenCorrect() {
    int expectedLineCount = 151;
    File file = new File("dictionary.txt");
    String patternCount = Unix4j.grep(Grep.Options.c, ".*?NINE.*?", file).
                          cut(CutOption.fields, ":", 1).toStringResult();
    
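    // toStringResult() returns text, so parse it before comparing with the int
    assertEquals(expectedLineCount, Integer.parseInt(patternCount.trim())); 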
}
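
A side note on the pattern itself: in Java’s regex syntax, when a pattern is searched for anywhere in a line, the lazy wildcards around NINE don’t change which lines are selected. The sketch below checks this with the JDK’s java.util.regex only. It assumes search (find) semantics and does not call Unix4J; the class name RegexEquivalence and the sample words are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Unix4J or Grep4J.
final class RegexEquivalence {

    // Returns the lines in which the regex is found anywhere (search semantics).
    static List<String> matching(String regex, List<String> lines) {
        Pattern pattern = Pattern.compile(regex);
        return lines.stream()
          .filter(line -> pattern.matcher(line).find())
          .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for a few dictionary lines.
        List<String> sample = Arrays.asList("NINE", "NINETEEN", "CANINE", "TEN");

        // Both patterns select NINE, NINETEEN and CANINE, so this prints "true".
        System.out.println(
          matching(".*?NINE.*?", sample).equals(matching("NINE", sample)));
    }
}
```

Under that assumption, the shorter pattern NINE would give the same count.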

4. With Grep4J

Next, let’s see how to use the Grep4J library to grep a pattern in a file residing either locally or in a remote location.

In the following examples, we’ll look at how to translate Unix grep commands into Java.

4.1. Build Configuration

Add the following dependency to your pom.xml (or the equivalent declaration to your build.gradle):

<dependency>
    <groupId>com.googlecode.grep4j</groupId>
    <artifactId>grep4j</artifactId>
    <version>1.8.7</version>
</dependency>

4.2. Grep Examples

Let’s start with the Java equivalent of this command:

grep "NINETEEN" dictionary.txt

Here’s the Java version of the command:

@Test 
public void givenLocalFile_whenGrepWithSimpleString_thenCorrect() {
    int expectedLineCount = 4;
    Profile localProfile = ProfileBuilder.newBuilder().
                           name("dictionary.txt").filePath(".").
                           onLocalhost().build();
    GrepResults results 
      = Grep4j.grep(Grep4j.constantExpression("NINETEEN"), localProfile);
    
    assertEquals(expectedLineCount, results.totalLines());
}

Another example is an inverse search, which returns the lines that do not contain the given text. Here’s the Unix version:

grep -v "NINETEEN" dictionary.txt

And here’s the Java version, this time run against a file on a remote host:

@Test
public void givenRemoteFile_whenInverseGrepWithSimpleString_thenCorrect() {
    int expectedLineCount = 178687;
    Profile remoteProfile = ProfileBuilder.newBuilder().
                            name("dictionary.txt").
                            filePath("/tmp/dictionary.txt").
                            onRemotehost("172.168.192.1").
                            credentials("user", "pass").build();
    GrepResults results = Grep4j.grep(
      Grep4j.constantExpression("NINETEEN"), remoteProfile, Option.invertMatch());
    
    assertEquals(expectedLineCount, results.totalLines()); 
}

Let’s see how we can use a regular expression to search for a pattern in a file. Here’s the Unix version, which counts the lines in the whole file that match the regular expression:

grep -c ".*?NINE.*?" dictionary.txt

Here’s the Java version:

@Test
public void givenLocalFile_whenGrepWithRegex_thenCorrect() {
    int expectedLineCount = 151;
    Profile localProfile = ProfileBuilder.newBuilder().
                           name("dictionary.txt").filePath(".").
                           onLocalhost().build();
    GrepResults results = Grep4j.grep(
      Grep4j.regularExpression(".*?NINE.*?"), localProfile, Option.countMatches());
    
    assertEquals(expectedLineCount, results.totalLines()); 
}

5. Conclusion

In this quick tutorial, we illustrated searching for a pattern in one or more files using Grep4J and Unix4J.

The implementation of these examples can be found in the GitHub project. It is a Maven-based project, so it should be easy to import and run as is.

Finally, you can naturally do some of the basics of grep-like functionality using the regex functionality in the JDK as well.
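
As a rough illustration of that last point, the sketch below reproduces the three grep variants used in this tutorial (plain match, -v, and -c) with only java.util.regex and java.nio.file. The class name JdkGrep and its method names are made up for this example and are not part of any library. It assumes the file is UTF-8 text and that a line matches when the pattern is found anywhere in it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical JDK-only helper; not part of Unix4J or Grep4J.
final class JdkGrep {

    private JdkGrep() {
    }

    // Core filter: keeps the lines where the regex is found,
    // or the lines where it is not found when invert is true.
    static List<String> grep(String regex, Stream<String> lines, boolean invert) {
        Pattern pattern = Pattern.compile(regex);
        return lines
          .filter(line -> pattern.matcher(line).find() != invert)
          .collect(Collectors.toList());
    }

    // Like "grep PATTERN file" (invert = false) or "grep -v PATTERN file" (invert = true).
    static List<String> grep(String regex, Path file, boolean invert) {
        // Files.lines reads lazily; try-with-resources closes the file afterwards.
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return grep(regex, lines, invert);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Like "grep -c PATTERN file": the number of matching lines.
    static int count(String regex, Path file) {
        return grep(regex, file, false).size();
    }
}
```

With this helper, JdkGrep.grep("NINETEEN", Paths.get("dictionary.txt"), false) would play the role of the first example, and passing true for the last argument would correspond to the -v variant.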

