Tuesday, July 5, 2011

Validating CSV Files

What is CsvValidator?
  A Java framework that validates CSV files against a field specification, much as an XSD is used to validate XML.

Why should I use this?
  You don't have to. It's easy to write something similar yourself, and you can check out the source code for reference.

Why did I write this?
  Some of our projects integrate with third-party applications that exchange information as CSV files, so I wrote a generic validator that can be hooked into multiple projects or used by QA for integration testing.

What is the license?
 GNU GPL v2

Are there any JUnit test cases I can check out?
 Yes, see the source.

How do I integrate it into my existing project?

Just add CsvValidator.jar (available as a download) to your project and you are good to go.

Create a CsvValidator; its constructor takes these 3 arguments:

        // first argument (filename) - the file to be validated (a sample file is available)
        // second argument (list) - defines all the fields in the above CSV file (a field has index, type, isOptional, regex)
        // last argument - the file delimiter, which can be anything, not just a comma


Check out this sample code:


    public static void main(String[] args) {
        boolean optional = true;
        boolean notOptional = false;

        // Requires java.util.List and java.util.ArrayList
        List<Field> list = new ArrayList<Field>();

        list.add(new Field(1, Type.NUMBER, notOptional));
        list.add(new Field(2, Type.NUMBER, notOptional));
        list.add(new Field(3, Type.TEXT, notOptional));
        list.add(new Field(4, Type.TEXT, notOptional));
        list.add(new Field(5, Type.NUMBER, notOptional));
        list.add(new Field(6, "Purchase Date", Type.DATE, notOptional, "yyyy-MM-dd HH:mm:ss"));
        list.add(new Field(7, Type.NUMBER, notOptional));
        list.add(new Field(8, Type.NUMBER, notOptional));
        list.add(new Field(9, "Campaign ID", Type.TEXT, optional));
        list.add(new Field(10, "Promo Code", Type.TEXT, optional));
        list.add(new Field(11, Type.TEXT, notOptional));
        list.add(new Field(12, Type.NUMBER, optional));
        list.add(new Field(13, Type.NUMBER, optional));
        list.add(new Field(14, Type.TEXT, notOptional));

         
        // Third argument is the delimiter; here a pipe, written as "\\|"
        CsvValidator validator1 = new CsvValidatorImpl("somefile.txt", list, "\\|");


        if (!validator1.isValid()) {
            System.out.println(validator1.getValidationDetails());
        }

    }
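
To make the idea behind the field list more concrete, here is a minimal sketch of the kind of check a field definition describes, written with only the JDK. It is not CsvValidator's source code: the class FieldCheckSketch, its check method and its parameters are hypothetical names made up for this illustration. It also shows one plausible reason for writing the pipe as "\\|": String.split treats its argument as a regular expression, where a bare | means "or".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not CsvValidator's source code.
class FieldCheckSketch {

    // Returns the problems found in one line; an empty list means the line passed.
    // numberColumns holds the 1-based indexes of columns that must contain integers.
    static List<String> check(String line, String delimiterRegex,
                              int expectedColumns, int[] numberColumns) {
        List<String> problems = new ArrayList<String>();
        // A limit of -1 keeps trailing empty columns instead of dropping them.
        String[] columns = line.split(delimiterRegex, -1);
        if (columns.length != expectedColumns) {
            problems.add("expected " + expectedColumns
                    + " columns but found " + columns.length);
            return problems;
        }
        for (int index : numberColumns) {
            String value = columns[index - 1];
            if (!Pattern.matches("-?\\d+", value)) {
                problems.add("column " + index + " is not a number: '" + value + "'");
            }
        }
        return problems;
    }
}
```

For example, FieldCheckSketch.check("12|x|USD", "\\|", 3, new int[] {1, 2}) reports a single problem for column 2, while "12|7|USD" passes with an empty list.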

Can my QA use it in standalone mode?
 Yes. Your QA just needs to create a spec.txt file and run:

java -jar "CsvValidator.jar" csv-file.txt spec.txt

The first line in spec.txt is the delimiter (, ' | etc.).
  Every other line describes one field, for example:
 
    Currency,T,R,
    Date of Purchase,D,R,yyyy-MM-dd HH:mm:ss

    first column - field name, which helps in understanding the validation results
    second column - type (T, N, D), representing Text, Number, Date
    third column - requirement (R, O), representing Required or Optional
    fourth column - regex (for dates, a date pattern)
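
The spec.txt layout above can be read with a few lines of plain Java. The sketch below is only an illustration of the format, not the tool's own parser: SpecSketch, FieldSpec and parse are hypothetical names, and it assumes (from the two example lines) that the columns of a spec line are always separated by commas.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for the spec.txt layout described above; not the tool's own parser.
class SpecSketch {

    // One parsed spec line: name, type code (T, N or D), required flag, pattern (may be empty).
    static class FieldSpec {
        final String name;
        final char type;
        final boolean required;
        final String pattern;

        FieldSpec(String name, char type, boolean required, String pattern) {
            this.name = name;
            this.type = type;
            this.required = required;
            this.pattern = pattern;
        }
    }

    // The first line is the data-file delimiter; each later non-blank line describes one field.
    static List<FieldSpec> parse(List<String> specLines) {
        List<FieldSpec> fields = new ArrayList<FieldSpec>();
        for (int i = 1; i < specLines.size(); i++) {
            String line = specLines.get(i).trim();
            if (line.isEmpty()) {
                continue;
            }
            // A limit of 4 leaves any commas inside the pattern column untouched.
            String[] parts = line.split(",", 4);
            if (parts.length < 3 || parts[1].trim().isEmpty()) {
                throw new IllegalArgumentException("bad spec line: " + line);
            }
            String pattern = parts.length == 4 ? parts[3].trim() : "";
            fields.add(new FieldSpec(parts[0].trim(), parts[1].trim().charAt(0),
                    parts[2].trim().equals("R"), pattern));
        }
        return fields;
    }
}
```

Parsing the two example lines yields a required Text field named Currency with an empty pattern, and a required Date field named Date of Purchase with the pattern yyyy-MM-dd HH:mm:ss.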

  (Regex for dates: these are date-format patterns in the java.text.SimpleDateFormat style, each followed by a matching example)

    "yyyy.MM.dd G 'at' HH:mm:ss z"      2001.07.04 AD at 12:08:56 PDT
    "EEE, MMM d, ''yy"                  Wed, Jul 4, '01
    "h:mm a"                            12:08 PM
    "hh 'o''clock' a, zzzz"             12 o'clock PM, Pacific Daylight Time
    "K:mm a, z"                         0:08 PM, PDT
    "yyyyy.MMMMM.dd GGG hh:mm aaa"      02001.July.04 AD 12:08 PM
    "EEE, d MMM yyyy HH:mm:ss Z"        Wed, 4 Jul 2001 12:08:56 -0700
    "yyMMddHHmmssZ"                     010704120856-0700
    "yyyy-MM-dd'T'HH:mm:ss.SSSZ"        2001-07-04T12:08:56.235-0700

 For all other fields, see a regex cheat sheet.
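
If you want to try a date pattern before putting it into a field definition or spec.txt, a small check with java.text.SimpleDateFormat works. This is a sketch under the assumption that the date patterns behave as SimpleDateFormat patterns; DatePatternSketch and matches are hypothetical names, not part of CsvValidator.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical helper for trying out date patterns; not part of CsvValidator.
class DatePatternSketch {

    // True only if the whole value parses as a real calendar date under the pattern.
    static boolean matches(String value, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
        // Strict mode rejects impossible dates such as February 30.
        format.setLenient(false);
        ParsePosition position = new ParsePosition(0);
        // parse returns null on failure; the index check rejects trailing text.
        return format.parse(value, position) != null
                && position.getIndex() == value.length();
    }
}
```

With the pattern yyyy-MM-dd HH:mm:ss, the value 2011-07-05 10:15:00 is accepted, while 2011-02-30 10:15:00 and a bare 2011-07-05 are rejected.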