
Regex For GST Identification Number (GSTIN)

What is the regex for the GST number in India? You can read more about GST numbers at https://cleartax.in/s/know-your-gstin. On a summary level, the number is represented as L

Solution 1:

Here is the regex and checksum validation for GSTIN

^\d{2}[A-Z]{5}\d{4}[A-Z]{1}[A-Z\d]{1}[Z]{1}[A-Z\d]{1}$


Format details

  1. The first 2 digits of the GST number represent the state code as per the Census (2011).
  2. The next 10 characters are the same as the taxpayer's PAN.
    • The first five are letters
    • The next four are digits
    • The last is a check letter
  3. The 13th character is the number of registrations you have taken within a state; after 9, A to Z stand for 10 to 35.
  4. The 14th character is Z by default.
  5. The last character is the check code.
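
The layout above can be sketched as a regex with one named group per field. This is only an illustration: the names `GSTIN_PARTS`, `splitGstin` and the group names are hypothetical and do not come from the original answer.

```javascript
// Hypothetical helper: one named capture group per field of the format list.
const GSTIN_PARTS = /^(?<stateCode>\d{2})(?<pan>[A-Z]{5}\d{4}[A-Z])(?<entityCode>[A-Z\d])(?<defaultChar>Z)(?<checkChar>[A-Z\d])$/;

function splitGstin(gstin) {
    const match = GSTIN_PARTS.exec(gstin);
    // exec returns null when the string does not follow the 15-character layout
    return match ? { ...match.groups } : null;
}

console.log(splitGstin('27AAPFU0939F1ZV'));
// { stateCode: '27', pan: 'AAPFU0939F', entityCode: '1', defaultChar: 'Z', checkChar: 'V' }
```

This only checks the shape of the string; it does not verify the check code.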

Here is the code for validating a GSTIN using the checksum in JavaScript:

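function checksum(g) {
    // Anchored so that a longer string containing a GSTIN-like run is rejected
    const regTest = /^\d{2}[A-Z]{5}\d{4}[A-Z][A-Z\d]Z[A-Z\d]$/.test(g);
    if (!regTest) return false;
    const chars = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    const mod = chars.length; // 36
    let sum = 0;
    for (let k = 0; k < 14; k++) {
        // Weight is 1 at even indexes and 2 at odd indexes
        const p = chars.indexOf(g[k]) * (k % 2 + 1);
        // Add the quotient and remainder of the product in base 36
        sum += Math.floor(p / mod) + (p % mod);
    }
    // The final % mod maps a sum divisible by 36 to the check character '0'
    return g[14] === chars[(mod - (sum % mod)) % mod];
}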

console.log(checksum('27AAPFU0939F1ZV'))
console.log(checksum('27AASCS2460H1Z0'))
console.log(checksum('29AAGCB7383J1Z4'))

GST regex and checksum in various programming languages


Solution 2:

Here is the regex that I came up with:

/^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$/

According to the H&R Block India GSTIN guide, the 13th 'digit' (entity code) is "an alpha-numeric number (first 1-9 and then A-Z)". That is, zero is not allowed, and A-Z represent 10-35. Hence [1-9A-Z] is more accurate than [0-9].

The last character, the "check digit", is indeed alphanumeric: [0-9A-Z]. I have confirmed this independently by obtaining and testing actual GSTINs.
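
As a quick check of what this pattern accepts and rejects, here is a small sketch. The constant name `GSTIN_SHAPE` and the sample strings other than the first are made up for illustration.

```javascript
// The regex from this answer, stored under a hypothetical constant name.
const GSTIN_SHAPE = /^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$/;

console.log(GSTIN_SHAPE.test('27AAPFU0939F1ZV'));  // true
console.log(GSTIN_SHAPE.test('27AAPFU0939F0ZV'));  // false: 0 in the 13th position
console.log(GSTIN_SHAPE.test('27AAPFU0939F1ZV9')); // false: 16 characters
console.log(GSTIN_SHAPE.test('27aapfu0939f1zv'));  // false: lower case is not accepted
```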


Solution 3:

The correct validation for GSTIN should be

^([0][1-9]|[1-2][0-9]|[3][0-7])([a-zA-Z]{5}[0-9]{4}[a-zA-Z]{1}[1-9a-zA-Z]{1}[zZ]{1}[0-9a-zA-Z]{1})$

The first 2 digits denote the State Code (01-37) as defined in the Code List for Land Regions.

The next 10 characters pertain to PAN Number in AAAAA9999X format.

13th character indicates the number of registrations an entity has within a state for the same PAN.

14th character is currently defaulted to "Z"

15th character is a checksum digit

This regex pattern accommodates lower and upper case.
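
The state-code prefix of this pattern can be tried on its own. The sketch below assumes the 01-37 range stated above; `STATE_CODE` and `hasValidStateCode` are hypothetical names, and the second and third sample strings are made up.

```javascript
// State-code alternation taken from the pattern above, anchored on its own.
const STATE_CODE = /^(0[1-9]|[12][0-9]|3[0-7])$/;

function hasValidStateCode(gstin) {
    // Only the first two characters carry the state code
    return STATE_CODE.test(gstin.slice(0, 2));
}

console.log(hasValidStateCode('27AAPFU0939F1ZV')); // true
console.log(hasValidStateCode('00AAPFU0939F1ZV')); // false: 00 is below the range
console.log(hasValidStateCode('38AAPFU0939F1ZV')); // false: 38 is above the assumed range
```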


Solution 4:

To add to the answers above, this answer also provides a code snippet for computing the checksum digit.

import java.util.Locale;

// The wrapper class name is illustrative; it only makes the snippet compilable.
public class GstinCheckDigit {
    public static final String GSTINFORMAT_REGEX = "[0-9]{2}[a-zA-Z]{5}[0-9]{4}[a-zA-Z]{1}[1-9A-Za-z]{1}[Z]{1}[0-9a-zA-Z]{1}";
    public static final String GSTN_CODEPOINT_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    public static String getGSTINWithCheckDigit(String gstinWOCheckDigit) {
        if (gstinWOCheckDigit == null) {
            throw new IllegalArgumentException("GSTIN supplied for checkdigit calculation is null");
        }
        // Normalise once so the returned value matches what was actually summed
        String normalized = gstinWOCheckDigit.trim().toUpperCase(Locale.ROOT);
        int mod = GSTN_CODEPOINT_CHARS.length(); // 36
        int factor = 2; // weights alternate 2, 1, 2, ... starting from the rightmost character
        int sum = 0;

        for (int i = normalized.length() - 1; i >= 0; i--) {
            int codePoint = GSTN_CODEPOINT_CHARS.indexOf(normalized.charAt(i));
            if (codePoint < 0) {
                // The original loop silently used -1 here, which corrupts the sum
                throw new IllegalArgumentException("Invalid GSTIN character: " + normalized.charAt(i));
            }
            int digit = factor * codePoint;
            factor = (factor == 2) ? 1 : 2;
            // Add the quotient and remainder of the product in base 36
            sum += (digit / mod) + (digit % mod);
        }
        int checkCodePoint = (mod - (sum % mod)) % mod;
        return normalized + GSTN_CODEPOINT_CHARS.charAt(checkCodePoint);
    }
}

Source: GST Google Group Link, Code Snippet Link
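
For comparison with the JavaScript validator in Solution 1, the same steps can be sketched in JavaScript. This is an illustrative port under the same weighting and base-36 assumptions as the Java snippet; `CODE_POINT_CHARS` and `appendCheckChar` are hypothetical names.

```javascript
// Illustrative port of the check-character steps; names are hypothetical.
const CODE_POINT_CHARS = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';

function appendCheckChar(gstinWithoutCheck) {
    const input = gstinWithoutCheck.trim().toUpperCase();
    const mod = CODE_POINT_CHARS.length; // 36
    let factor = 2; // rightmost character gets weight 2, then 1, 2, 1, ...
    let sum = 0;
    for (let i = input.length - 1; i >= 0; i--) {
        const codePoint = CODE_POINT_CHARS.indexOf(input[i]);
        if (codePoint < 0) {
            throw new Error(`Invalid GSTIN character: ${input[i]}`);
        }
        const product = factor * codePoint;
        factor = factor === 2 ? 1 : 2;
        // Add the quotient and remainder of the product in base 36
        sum += Math.floor(product / mod) + (product % mod);
    }
    const checkCodePoint = (mod - (sum % mod)) % mod;
    return input + CODE_POINT_CHARS[checkCodePoint];
}

console.log(appendCheckChar('27AAPFU0939F1Z')); // 27AAPFU0939F1ZV
console.log(appendCheckChar('27AASCS2460H1Z')); // 27AASCS2460H1Z0
```

For the first sample the weighted sum is 221, and 221 mod 36 is 5, so the check code point is 36 - 5 = 31, which is the letter V.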


Solution 5:

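Note that the pattern below matches email addresses (a local part, an @ sign and a dotted domain), not GSTINs, so it does not validate the 15-character format described in the other solutions.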

^([a-zA-Z0-9_\.\-])+\@(([a-zA-Z0-9\-])+\.)+([a-zA-Z0-9]{2,4})+$
