Converting Symbols, Accent Letters to English Alphabet in Java

 
To convert accented characters such as é à î into their plain equivalents e a i, while leaving characters such as _ and @ untouched, the following code is useful.
 
The program works in Java and serves a single purpose: removing diacritical marks (accents).

It uses Unicode NFD normalization to decompose each accented character into its unaccented base letter followed by its combining diacritical marks. A regex then strips the combining marks, leaving only the base letters.
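To make the decomposition step concrete, here is a minimal sketch that prints the UTF-16 code units of a string before and after NFD normalization. The class name `NfdDemo` and the helper `codeUnits` are hypothetical names chosen for this illustration; only `Normalizer.normalize` comes from the JDK.

```java
import java.text.Normalizer;

public class NfdDemo {

    // Hypothetical helper: render each UTF-16 code unit of s as U+XXXX.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String composed = "\u00e9"; // precomposed "é"
        String decomposed = Normalizer.normalize(composed, Normalizer.Form.NFD);
        System.out.println(codeUnits(composed));   // U+00E9
        System.out.println(codeUnits(decomposed)); // U+0065 U+0301
    }
}
```

After normalization the single character é has become the plain letter e (U+0065) followed by the combining acute accent (U+0301), which is exactly what the regex removes.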
 

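import java.text.Normalizer;
import java.util.regex.Pattern;

public class Accent2Deaccent {

    // Strips diacritical marks: "é" becomes "e"; plain characters are unchanged.
    // Declared static so that main can call it without creating an instance.
    public static String deAccent(String str) {
        // NFD splits each accented character into base letter + combining mark(s).
        String nfdNormalizedString = Normalizer.normalize(str, Normalizer.Form.NFD);
        // The backslash must be doubled inside a Java string literal.
        Pattern pattern = Pattern.compile("\\p{InCombiningDiacriticalMarks}+");
        return pattern.matcher(nfdNormalizedString).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(deAccent("é à î _ @")); // prints: e a i _ @
    }
}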
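
If the conversion is called often, one possible variant compiles the pattern once and reuses it. This is only a sketch: the class name `AccentStripper`, the method `strip`, and the null handling are assumptions made for this example, not part of the program above.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

public final class AccentStripper {

    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern MARKS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    private AccentStripper() {
        // Utility class, not meant to be instantiated.
    }

    // Returns the input with combining diacritical marks removed (null stays null).
    public static String strip(String input) {
        if (input == null) {
            return null;
        }
        String nfd = Normalizer.normalize(input, Normalizer.Form.NFD);
        return MARKS.matcher(nfd).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("Cr\u00e8me br\u00fbl\u00e9e")); // Creme brulee
        // "ø" (U+00F8) and "ß" (U+00DF) have no canonical decomposition,
        // so NFD leaves them as they are and they are returned unchanged.
        System.out.println(strip("\u00f8 \u00df"));
    }
}
```

Letters such as ø and ß are not a base letter plus an accent in Unicode, so this approach leaves them alone; they would need an explicit mapping if plain ASCII output is required.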