[Solved] Converting .txt Spark Output to .csv


This is a simple approach that converts the txt output data into a data structure (that can easily be written into a csv file).

The basic idea is using data structures along with the amount of headers / columns in order to parse entry sets from the one liner txt output.

Have a look at the code comments, every TODO 4 U” means work for you, mostly because I cannot really guess what you need to do at those positions in the code (like how to get the headers).

This is just a main method that does its work straight forward. You may want to understand what it does and apply changes that make the code meet your requiremtens. Input and output are just Strings that you have to create, receive or process yourself.

public static void main(String[] args) {

    // TODO 4 U: get the values for the header somehow
    String headerLine = "NAME, UNIV, DEGREE, COURSE";

    // TODO 4 U: read the txt output
    String txtOutput = "John MIT Bachelor ComputerScience Mike UB Master ComputerScience";

    /*
     * then split the header line
     * (or do anything similar, I don't know where your header comes from)
     */
    String[] headers = headerLine.split(", ");

    // store the amount of headers, which is the amount of columns
    int amountOfColumns = headers.length;

    // split txt output data by space
    String[] data = txtOutput.split(" ");

    /*
     * declare a data structure that stores lists of Strings,
     * each one is representing a line of the csv file
     */
    Map<Integer, List<String>> linesForCsv = new TreeMap<Integer, List<String>>();

    // get the length of the txt output data
    int a = data.length;

    // create a list of Strings containing the headers and put it into the data structure
    List<String> columnHeaders = Arrays.asList(headers);
    linesForCsv.put(0, columnHeaders);

    // declare a line counter for the csv file
    int l = 0;
    // go through the txt output data in order to get the lines for the csv file
    for (int i = 0; i < a; i++) {
        // check if there is a new line to be created
        if (i % amountOfColumns == 0) {
            /*
             * every time the amount of headers is reached,
             * create a new list for a new line in the csv file
             */
            l++; // increment the line counter (even at 0 because the header row is inserted at 0)
            linesForCsv.put(l, new ArrayList<String>()); // create a new line-list
            linesForCsv.get(l).add(data[i]); // add the data to the line-list
        } else {
            // if there is no new line to be created, store the data in the current one
            linesForCsv.get(l).add(data[i]);
        }
    }

    // print the lines stored in the map
    // TODO 4 U: write this to a csv file instead of just printing it to the console
    linesForCsv.forEach((lineNumber, line) -> {
        System.out.println("Line " + lineNumber + ": " + String.join(",", line));
    });
}

0

solved Converting .txt Spark Output to .csv