In this article, we will develop a program to read CSV files and validate the data. The CSV file is comma-delimited and contains user details retrieved from the server, meaning we cannot modify the CSV file; we can only read it. Therefore, our task is to read the CSV file, split it, and handle cases where the address may contain additional commas.
Parsing .csv files can be challenging, especially when the file may contain either comma-separated strings, comma-separated quoted strings, or a combination of both. However, the solution we will discuss accommodates all possibilities.
Using the function provided below, we can extract values from a CSV file, even when some cells contain commas.
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string csvFilePath = @"C:\Path\to\your\csvfile.csv";
        List<string[]> csvData = new List<string[]>();

        // Read the file one line at a time and split each line into columns
        using (StreamReader reader = new StreamReader(csvFilePath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] columns = SplitCSVLine(line);
                csvData.Add(columns);
            }
        }

        // Now you can use csvData for further processing
        foreach (string[] row in csvData)
        {
            foreach (string column in row)
            {
                Console.Write(column + " | ");
            }
            Console.WriteLine();
        }
    }

    static string[] SplitCSVLine(string line)
    {
        List<string> columns = new List<string>();

        // Split only on commas that are followed by an even number of
        // double quotes, i.e. commas that are outside a quoted field
        Regex csvParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
        string[] values = csvParser.Split(line);

        foreach (string value in values)
        {
            // Remove surrounding spaces and the enclosing double quotes
            string trimmedValue = value.Trim(' ', '"');
            columns.Add(trimmedValue);
        }
        return columns.ToArray();
    }
}
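This code reads a CSV file line by line, splitting each line into columns while leaving commas inside double quotes untouched. It then stores each row of the CSV file as an array of strings in the list csvData. Note that Trim also removes leading and trailing spaces from each value, and that a doubled quote ("") inside a quoted field is left as-is rather than being converted to a single quote.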
CSV file data:
The CSV file below contains user information. We are going to read the file and split it into cells; note that the user address contains commas.
Name,Country,Address
Chop-suey Chinese,Mexico,"Forsterstr, 57"
Consolidated Holdings,Germany,"24, place Kleber"
Eastern Connection,UK,"67, rue des Cinquante Otages"
We want to read a CSV file that has comma-separated data within double quotes. As the example above shows, the address field contains commas. We want output like the following.
Output
Name:"Chop-suey Chinese",country:"Mexico",address:"Forsterstr, 57"
Name:"Consolidated Holdings",country:"Germany",address:"24, place Kleber"
Name:"Eastern Connection",country:"UK",address:"67, rue des Cinquante Otages"
We can do this in several ways, and we will discuss a few of them. We have created the model class below for parsing the CSV file.
public class UserDetail
{
    public string name { get; set; }
    public string country { get; set; }
    public string address { get; set; }
}
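As a minimal sketch of how the pieces fit together, the sample data can be split with the same regular expression used in SplitCSVLine and mapped onto the model. The class name CsvMapper and the methods ParseUsers and Format are hypothetical names chosen for illustration, and UserDetail is repeated only so the block is self-contained. Skipping rows with fewer than three columns is an assumption, not something the article prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public class UserDetail
{
    public string name { get; set; }
    public string country { get; set; }
    public string address { get; set; }
}

// Hypothetical helper: maps CSV rows onto UserDetail objects.
public static class CsvMapper
{
    // Splits on commas that are followed by an even number of double quotes,
    // i.e. commas that are not inside a quoted field.
    private static readonly Regex Splitter =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

    // Reads CSV text from any TextReader, skips the first (header) line,
    // and maps the first three columns onto UserDetail.
    public static List<UserDetail> ParseUsers(TextReader reader)
    {
        var users = new List<UserDetail>();
        string line;
        bool isHeader = true;
        while ((line = reader.ReadLine()) != null)
        {
            if (isHeader) { isHeader = false; continue; }
            if (line.Length == 0) continue;

            string[] values = Splitter.Split(line);
            if (values.Length < 3) continue; // assumption: skip short rows

            users.Add(new UserDetail
            {
                name = values[0].Trim(' ', '"'),
                country = values[1].Trim(' ', '"'),
                address = values[2].Trim(' ', '"')
            });
        }
        return users;
    }

    // Formats one record in the output style shown in the article.
    public static string Format(UserDetail d)
    {
        return $@"Name:""{d.name}"",country:""{d.country}"",address:""{d.address}""";
    }
}
```

To try it, pass a StreamReader for the file (or a StringReader over in-memory text) to CsvMapper.ParseUsers and print CsvMapper.Format for each returned record.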
Method 1: Use LINQtoCSV from NuGet
Install the NuGet package: Install-Package LinqToCsv
// Requires the LINQtoCSV package and System.Diagnostics (for Debug.Print)
public void ParseUsingLINQtoCSV()
{
    string csv = @"E:\MyJson\results.csv";
    CsvContext cc = new CsvContext();
    CsvFileDescription inputFileDescription = new CsvFileDescription
    {
        SeparatorChar = ',',            // specify the separator
        FirstLineHasColumnNames = true, // specify whether the CSV has a header
        IgnoreUnknownColumns = true
    };
    IEnumerable<UserDetail> userDetails = cc.Read<UserDetail>(csv, inputFileDescription);
    foreach (var d in userDetails)
    {
        Debug.Print($@"Name:""{d.name}"",country:""{d.country}"",address:""{d.address}""");
    }
}
Method 2: Use TextFieldParser
The Microsoft.VisualBasic.FileIO.TextFieldParser class can parse the CSV file. TextFieldParser handles a delimited file, TextReader, or Stream in which some fields are enclosed in quotes and some are not.
To add Microsoft.VisualBasic.FileIO, right-click your project and select Add Reference… In the Reference Manager, expand Assemblies and select Framework, then check the box for Microsoft.VisualBasic and click OK.
public void ParseUsingTextFieldParser()
{
    string csv = @"E:\MyJson\results.csv";
    List<UserDetail> userDetails = new List<UserDetail>();

    // The using block closes the parser when reading is finished
    using (TextFieldParser csvParser = new TextFieldParser(csv))
    {
        csvParser.HasFieldsEnclosedInQuotes = true;
        csvParser.SetDelimiters(",");

        int i = 0;
        while (!csvParser.EndOfData)
        {
            string[] values = csvParser.ReadFields();
            if (i > 0) // skip the header row
            {
                UserDetail userDetail = new UserDetail();
                userDetail.name = values[0];    // name
                userDetail.country = values[1]; // country
                userDetail.address = values[2]; // address
                userDetails.Add(userDetail);
            }
            i++;
        }
    }

    foreach (var d in userDetails)
    {
        Debug.Print($@"Name:""{d.name}"",country:""{d.country}"",address:""{d.address}""");
    }
}
Method 3: Use a regular expression
public void ParseCsvUsingRegex()
{
    string csv = @"E:\MyJson\cs.csv";
    List<UserDetail> userDetails = new List<UserDetail>();

    // Values may contain commas, so split only on commas outside double quotes
    Regex csvParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

    using (var streamReader = File.OpenText(csv))
    {
        int i = 0;
        while (!streamReader.EndOfStream)
        {
            var line = streamReader.ReadLine();
            if (!string.IsNullOrEmpty(line))
            {
                string[] values = csvParser.Split(line);
                if (i > 0) // skip the header row
                {
                    UserDetail userDetail = new UserDetail();
                    userDetail.name = values[0];    // name
                    userDetail.country = values[1]; // country
                    // Strip the enclosing quotes; this also works when the value is not quoted
                    userDetail.address = values[2].Trim('"'); // address
                    userDetails.Add(userDetail);
                }
            }
            i++;
        }
    }

    foreach (var d in userDetails)
    {
        Debug.Print($@"Name:""{d.name}"",country:""{d.country}"",address:""{d.address}""");
    }
}
Thanks for reading. If you have any questions, please leave a comment.
Files with the .csv extension are plain text files that contain records of data with comma-separated values. Each line in a CSV file is a new record from the set of records contained in the file. Such files are typically generated when data is to be transferred from one storage system to another. Because the format is so widely recognized, importing such data files into a database is convenient. Almost all spreadsheet applications, such as Microsoft Excel or OpenOffice Calc, can import CSV without much effort. Data imported from such files is arranged in the cells of a spreadsheet for presentation to the user.
There may be an optional header line appearing as the first line of the file, with the same format as normal record lines. This header contains names corresponding to the fields in the file and should contain the same number of fields as the records in the rest of the file.
Within the header and each record, there may be one or more fields, separated by commas. Each line should contain the same number of fields throughout the file. Spaces are considered part of a field and should not be ignored. The last field in a record must not be followed by a comma.
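The rule that every record carries the same number of fields as the header is easy to check once each line has been split. The sketch below assumes the rows are already available as string arrays (for example, the output of SplitCSVLine) and that the first row is the header. The names CsvValidator and FindRowsWithWrongFieldCount are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: checks that every row matches the header's field count.
public static class CsvValidator
{
    // Returns the 1-based line numbers of rows whose field count
    // differs from that of the first row (assumed to be the header).
    public static List<int> FindRowsWithWrongFieldCount(IList<string[]> rows)
    {
        var badLines = new List<int>();
        if (rows.Count == 0)
        {
            return badLines; // nothing to validate
        }

        int expected = rows[0].Length;
        for (int i = 1; i < rows.Count; i++)
        {
            if (rows[i].Length != expected)
            {
                badLines.Add(i + 1); // convert 0-based index to line number
            }
        }
        return badLines;
    }
}
```

An empty result means every record has the expected number of fields; otherwise the returned line numbers can be logged or reported before any further processing.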
Each field may or may not be enclosed in double quotes. If a field is not enclosed in double quotes, then double quotes may not appear inside it.
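These quoting rules can also be applied without a regular expression by walking the line one character at a time. The sketch below is one possible approach, not the method used earlier in the article. It additionally treats a doubled quote ("") inside a quoted field as a single literal quote, which is the escaping convention described in RFC 4180 rather than something stated above. It keeps spaces as part of the field and does not handle quoted fields that span several lines. QuotedCsvSplitter and SplitLine are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: splits one CSV line by tracking quote state.
public static class QuotedCsvSplitter
{
    public static string[] SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"')
                {
                    if (i + 1 < line.Length && line[i + 1] == '"')
                    {
                        current.Append('"'); // doubled quote: literal quote
                        i++;                 // skip the second quote
                    }
                    else
                    {
                        inQuotes = false;    // closing quote
                    }
                }
                else
                {
                    current.Append(c);       // commas are kept while quoted
                }
            }
            else if (c == '"')
            {
                inQuotes = true;             // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // field boundary
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString());      // last field has no trailing comma
        return fields.ToArray();
    }
}
```

Calling QuotedCsvSplitter.SplitLine on the third sample record yields three fields, with the address kept intact as 67, rue des Cinquante Otages.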