
Write Arabic text on file, Language Encoding on File using C#


 

Writing Arabic to a text file is a little tricky. One of my recent projects gave me a new challenge: writing Arabic text to a CSV (comma-separated values) file. I used to think that writing a CSV file was easy in any language, but it becomes a little intimidating when you have to write Arabic or Spanish to the file from a Windows service. My task was to read Arabic and English text and write it to the CSV file, but when I tried, the file showed unusual text and, most of the time, ???? question marks. I Googled the problem but did not find a solution, so I started digging into the StreamWriter class, its methods and its properties, and searched for the encodings used by System.IO and StreamWriter. I found that different languages use different encodings.
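As a rough illustration of where the question marks can come from, the sketch below encodes the two-letter Arabic string "لا" with an encoding that has no Arabic letters (ASCII) and with UTF-8. The class and method names are made up for this example, and ASCII merely stands in for any encoding that cannot represent the text; it is not necessarily the encoding that caused the problem in my service.

```csharp
using System;
using System.Text;

public static class QuestionMarkDemo
{
    // "لا" (LAM + ALEF) written with escapes so the source file's own encoding cannot interfere.
    public const string Sample = "\u0644\u0627";

    // ASCII has no Arabic letters, so the default encoder fallback
    // substitutes '?' (byte 63) for every character it cannot represent.
    public static byte[] ToAscii(string text)
    {
        return Encoding.ASCII.GetBytes(text);
    }

    // UTF-8 can represent every Unicode character; each of these two letters takes two bytes.
    public static byte[] ToUtf8(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ToAscii(Sample))); // 63, 63
        Console.WriteLine(string.Join(", ", ToUtf8(Sample)));  // 217, 132, 216, 167
    }
}
```

Once the text has been turned into byte 63, the original letters are gone for good, which is why the choice of encoding at write time matters.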

I found the link below useful for understanding how encoding works in .NET:

http://msdn.microsoft.com/en-us/library/system.text.encoding.default.aspx

I tried many forums (http://forums.digitalpoint.com/showthread.php?t=537748) to get my work done but failed, and in the end I found the solution myself.

The code below is self-explanatory. I used BinaryWriter instead of StreamWriter because we treat our production servers as "dumb" when it comes to language support, except for XML.

// I tried this code first. It reads the Arabic properly, but when StreamWriter tries to write
// the Arabic, it fails and writes junk values instead of proper Arabic.
// Note: code page 720 is the DOS Arabic code page; Windows-1256 is code page 1256.
StreamReader sr = new StreamReader(@"C:\utf-8.txt", Encoding.Default);
string str = sr.ReadLine();
sr.Close();
StreamWriter sw = new StreamWriter(@"D:\windows-1256.txt", false, Encoding.GetEncoding(720));
str = str.Normalize(NormalizationForm.FormKD);
sw.Write(str);
sw.Flush();

// I also checked the Arabic conversion in code at runtime, which gave me the proper result,
// so I concluded that the problem was somewhere in StreamWriter.
var input = "لا";
var encoding = Encoding.GetEncoding(720);
var result = encoding.GetBytes(input);
Console.WriteLine(string.Join(", ", result));
// Console.ReadLine();
// 225, 199 are the Windows-1256 bytes for "لا"; code page 720 maps them to other characters.
var bytes = new byte[] { 225, 199 };
var result2 = encoding.GetString(bytes);
Console.WriteLine(result2);
sw.WriteLine(result2);
Console.ReadLine();
sw.Close();

To find a solution, I set StreamWriter aside and looked for an alternative, which turned out to be BinaryWriter. According to the MCTS 70-536 material, BinaryWriter writes to the stream as binary, without concern for what the content is:

StreamWriter is for text and BinaryWriter writes the actual binary representation of what you want to write. Source: http://stackoverflow.com/questions/4614318/whats-the-difference-between-a-streamwriter-and-a-binarywriter
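To make "the actual binary representation" a bit more concrete, here is a small sketch that captures the bytes BinaryWriter.Write(string) produces for the string "لا". The class and method names are invented for the example. With the default constructor the writer uses UTF-8 and puts a length prefix in front of the string, so the output is one length byte followed by the four UTF-8 bytes.

```csharp
using System;
using System.IO;

public static class BinaryWriterDemo
{
    // Returns the exact bytes that BinaryWriter.Write(string) emits for the given text.
    public static byte[] WriteWithBinaryWriter(string text)
    {
        using (var stream = new MemoryStream())
        {
            // BinaryWriter(Stream) uses UTF-8 unless another encoding is passed in.
            using (var writer = new BinaryWriter(stream))
            {
                writer.Write(text);
            }
            // ToArray still works after the stream has been closed by the writer.
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        // "لا" written with escapes; it encodes to 4 bytes in UTF-8.
        byte[] bytes = WriteWithBinaryWriter("\u0644\u0627");
        Console.WriteLine(string.Join(", ", bytes)); // 4, 217, 132, 216, 167
    }
}
```

The leading 4 is the length prefix (the byte count of the string), so a file written this way starts with that extra byte before the text itself.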

With the binary approach I got my work done. It works for the two or three languages I tested, namely Arabic, English and Spanish. See the code below.

FileStream writeStream;
writeStream = new FileStream("D:\\arabicc.txt", FileMode.Create);
BinaryWriter writeBinary = new BinaryWriter(writeStream);
string f = "عرض النتائج لـ";
var input = "ومن وتزويده الشّعبين";
input = input + f;
// Write(string) emits a length prefix followed by the string in the writer's encoding (UTF-8 by default).
writeBinary.Write(input);
writeBinary.Close();

[Image: Arabic text in the output text file]

Note: As the image above shows, every Arabic word is visible in my destination file, without installing any fonts or other software on the PC. 🙂
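For comparison only, and not what I used in the project: since BinaryWriter ends up writing UTF-8, a StreamWriter that is told explicitly to use UTF-8 with a byte order mark may also work for a CSV file, without the length prefix. The sketch below assumes the program reading the file accepts UTF-8. The class name, the method name and the temp-file path are hypothetical, and the fields are joined without any CSV quoting to keep it short.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8CsvDemo
{
    // Writes one CSV line as UTF-8 with a byte order mark and returns the file's raw bytes.
    public static byte[] WriteCsvLine(string path, params string[] fields)
    {
        // UTF8Encoding(true) makes the writer emit the EF BB BF preamble at the start of the file.
        using (var writer = new StreamWriter(path, false, new UTF8Encoding(true)))
        {
            // No quoting or escaping of fields here; a real CSV writer would need that.
            writer.Write(string.Join(",", fields));
        }
        return File.ReadAllBytes(path);
    }

    public static void Main()
    {
        string path = Path.GetTempFileName(); // hypothetical location, just for the demo
        byte[] bytes = WriteCsvLine(path, "hello", "\u0644\u0627");

        // The first three bytes are the UTF-8 byte order mark.
        Console.WriteLine(bytes[0] + ", " + bytes[1] + ", " + bytes[2]); // 239, 187, 191

        // Reading the file back as UTF-8 returns the original text.
        string roundTrip = File.ReadAllText(path, Encoding.UTF8);
        Console.WriteLine(roundTrip == "hello,\u0644\u0627"); // True

        File.Delete(path);
    }
}
```

The byte order mark is there because some spreadsheet programs rely on it to recognise a CSV file as UTF-8; whether it is needed depends on the consumer.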

 

 

 
