How to extract dates from a .txt file in Java

In this tutorial, we are going to learn how to extract the dates from a text file. Our task is to print all the dates present in a given text file. We are going to use the concept of regular expressions in Java to solve this task.

We use the classes java.util.regex.Pattern and java.util.regex.Matcher to work with regular expressions. The Pattern class compiles the given regular expression. The Matcher class searches an input string for matches of that pattern. The find() method looks for the next match in the input and returns true if one is found, else it returns false. The group() method returns the part of the input that was matched by the most recent successful find().
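To make the find() and group() loop concrete, here is a minimal sketch that collects every match of a pattern in a string. The class name FindGroupDemo, the helper findAll and the sample text are made up for illustration and are not part of the program in this tutorial.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindGroupDemo {

    // Hypothetical helper: returns every substring of text matched by regex,
    // in order of appearance.
    static List<String> findAll(String regex, String text) {
        Pattern pattern = Pattern.compile(regex);  // compile the expression once
        Matcher matcher = pattern.matcher(text);   // bind it to the input
        List<String> matches = new ArrayList<>();
        // Each call to find() moves on to the next match, if there is one.
        while (matcher.find()) {
            matches.add(matcher.group());          // text of the current match
        }
        return matches;
    }

    public static void main(String[] args) {
        // prints [12, 3, 407]
        System.out.println(findAll("[0-9]+", "Room 12, floor 3, block 407"));
    }
}
```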

We use the classes java.io.BufferedReader and java.io.FileReader to read the text present in the given file. The FileReader class reads characters from a file, and the BufferedReader class wraps it to buffer the input and provide the readLine() method for reading one line at a time.
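As a rough sketch of how the two classes fit together, the example below reads a file into a list of lines. It uses a try-with-resources block so the reader is closed automatically. The class name ReadLinesDemo, the helper readLines and the temporary sample file are assumptions made only so that the example can run on any machine.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReadLinesDemo {

    // Hypothetical helper: reads every line of the file at location into a list.
    static List<String> readLines(String location) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if readLine() throws.
        try (BufferedReader reader = new BufferedReader(new FileReader(location))) {
            String line;
            // readLine() returns null once the end of the file is reached.
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Sample file created only for this demonstration.
        Path sample = Files.createTempFile("sample", ".txt");
        Files.write(sample, Arrays.asList("first line", "second line"));
        System.out.println(readLines(sample.toString()));
        Files.delete(sample);
    }
}
```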

Note: This particular program extracts all the dates in the ‘dd/mm/yyyy’ and ‘dd-mm-yyyy’ formats only. If another date format is required, kindly change the regular expression accordingly.
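As an illustration of changing the regular expression for another format, the sketch below targets dates written as yyyy-mm-dd. The choice of format, the class name IsoDateRegexDemo and the sample sentence are assumptions for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IsoDateRegexDemo {

    // Assumed target format: a 4-digit year, a 2-digit month (01 to 12)
    // and a 2-digit day (01 to 31), separated by hyphens.
    static final Pattern ISO_DATE =
            Pattern.compile("[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])");

    // Returns every yyyy-mm-dd date found in text.
    static List<String> extract(String text) {
        List<String> dates = new ArrayList<>();
        Matcher matcher = ISO_DATE.matcher(text);
        while (matcher.find()) {
            dates.add(matcher.group());
        }
        return dates;
    }

    public static void main(String[] args) {
        // 2000-13-40 is skipped because 13 is not a month in the range 01 to 12.
        // prints [2000-09-21, 2001-01-05]
        System.out.println(extract("Released 2000-09-21, patched 2000-13-40 and 2001-01-05."));
    }
}
```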


Java program to extract dates from a .txt file

Steps:

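  1. Form a regular expression that matches dates of the format “dd/mm/yyyy” or “dd-mm-yyyy” and compile this expression using the compile() method.
    Regular Expression = (0?[1-9]|[12][0-9]|3[01])[/-](0?[1-9]|1[0-2])[/-][0-9]{4}
    The character class [/-] matches either a slash or a hyphen. No | is needed inside a character class, where it would be treated as a literal character.
    The pattern restricts ‘dd’ to the range 1 to 31 and ‘mm’ to the range 1 to 12, and requires ‘yyyy’ to be a 4-digit number. It does not check that the day exists in the given month, so a string such as 31/02/2000 is still matched, and it accepts mixed separators such as 21/09-2000.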
  2. Now, open the given file with a FileReader and wrap it in a BufferedReader object. Syntax:
    BufferedReader lineReader = new BufferedReader(new FileReader(location));

    Here, the variable location is a String which contains the path of the required text file.

  3. Now, using the readLine() method in the BufferedReader class, read the content of the file line by line.
  4. After reading each line, create a Matcher for that line using the matcher() method of the compiled pattern.
  5. The find() method returns true each time it finds another date in that line in the ‘dd/mm/yyyy’ or ‘dd-mm-yyyy’ format. Call it in a loop and print every date found using the group() method.
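The regular expression in step 1 only checks the numeric ranges of the day and month. If stricter results are wanted, one possible approach is sketched below: a backreference forces both separators to be the same, lookarounds stop a date from being cut out of a longer run of digits, and java.time.LocalDate rejects days that do not exist in the calendar. This is a variation, not the program used in this tutorial, and the class name StrictDateDemo and the helper extractValidDates are hypothetical.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StrictDateDemo {

    // Group 1 = day, group 2 = separator, group 3 = month, group 4 = year.
    // \\2 requires the second separator to equal the first one.
    // (?<![0-9]) and (?![0-9]) forbid a digit right before or after the date.
    private static final Pattern DATE = Pattern.compile(
            "(?<![0-9])(0?[1-9]|[12][0-9]|3[01])([/-])(0?[1-9]|1[0-2])\\2([0-9]{4})(?![0-9])");

    // Returns only those matches that are real calendar dates.
    static List<String> extractValidDates(String text) {
        List<String> dates = new ArrayList<>();
        Matcher matcher = DATE.matcher(text);
        while (matcher.find()) {
            int day = Integer.parseInt(matcher.group(1));
            int month = Integer.parseInt(matcher.group(3));
            int year = Integer.parseInt(matcher.group(4));
            try {
                // Throws DateTimeException for dates such as 31 February.
                LocalDate.of(year, month, day);
                dates.add(matcher.group());
            } catch (DateTimeException e) {
                // Not a real calendar date, so it is skipped.
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        String text = "Valid: 29/02/2000 and 5-9-2021. "
                + "Invalid: 31/02/2000, 21/09-2000, 29-02-2001.";
        // prints [29/02/2000, 5-9-2021]
        System.out.println(extractValidDates(text));
    }
}
```

Here 31/02/2000 and 29-02-2001 match the pattern but fail the calendar check, while 21/09-2000 is rejected by the pattern itself because its separators differ.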

Java Code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ExtractDates {
  
  public static void extractDates(String regex) {
      Pattern pattern = Pattern.compile(regex);
      boolean found = false;
      // Path of the input file. Change this to the location of the required file on your system.
      String location = "C:\\Users\\Dell\\eclipse-workspace\\ExtractDates\\date.txt";

      // try-with-resources closes the reader automatically, even if reading fails.
      try (BufferedReader lineReader = new BufferedReader(new FileReader(location))) {
          String line;
          // Read the contents of the file line by line.
          while ((line = lineReader.readLine()) != null) {
              // Match the current line against the regular expression.
              Matcher matcher = pattern.matcher(line);
              // The loop body runs once for every date found in the line.
              while (matcher.find()) {
                  System.out.println(matcher.group());
                  found = true;
              }
          }

          if (!found) {
              // Reached only when no line of the file contained a date.
              System.out.println("The file does not contain dates.");
          }
      } catch (IOException e) {
          // Thrown when an input or output operation fails or is interrupted,
          // for example when trying to read from a file that does not exist.
          System.out.println("Error: " + e.getMessage());
      }
  }

  public static void main(String[] args) {
      // Regular expression for dates of the format 'dd/mm/yyyy' or 'dd-mm-yyyy'.
      String regex = "(0?[1-9]|[12][0-9]|3[01])[/-](0?[1-9]|1[0-2])[/-][0-9]{4}";
      extractDates(regex);
  }
}

Output:

  1. Extracting all the dates present in the following text file:
    Today is 21-09-2000 and yesterday was 20/09/2000. 
    Tomorrow will be 22/09/2000 and day after Tomorrow will be 23-09-2000.

    The output of the program:

    21-09-2000
    20/09/2000
    22/09/2000
    23-09-2000
  2. Extracting all the dates present in the following text file:
    Hello, good morning! 
    Have a wonderful day.

    The output of the program:

    The file does not contain dates.

