At long last I've finished the CSV parser I've been working on. Most of the technical problems were string matching problems, and they came down to learning that a string is a list of integers, and that a single char can be matched by prefixing it with a dollar sign (for example, $, matches a comma).
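To make the string matching point concrete, here is a minimal sketch of matching on the first char of a string. The module name `char_match` and the function names are made up for illustration and are not taken from the parser itself.

```erlang
-module(char_match).
-export([same_as_list/0, classify/1]).

%% A string literal is just a list of character codes,
%% so "a,b" and [$a, $,, $b] are the same term.
same_as_list() ->
    "a,b" =:= [$a, $,, $b].

%% Match on the head of the string, one char at a time.
%% $, is the comma, $" is the double quote, $\n is the newline.
classify([$, | _Rest])  -> comma;
classify([$" | _Rest])  -> quote;
classify([$\n | _Rest]) -> newline;
classify([_C | _Rest])  -> other;
classify([])            -> eof.
```

For example, `char_match:classify(",x")` matches the first clause because the head of the list is the integer 44, which is what `$,` stands for.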
The parser uses a state machine implemented as a process. The caller messages it the next char, one at a time, until it reaches the end of the file/string, at which point it sends an eof atom and waits for the process to message back the parsed CSV. The parser ended up using quite a lot of Erlang's features, including processes, funs and parametrised macros, and the end result was pretty clean. It can take either a plain string or an IO device such as a file as the string source, and the difference is handled in a nice way by using funs to get the next char. I found the switch from OOP to functional confusing at first, since I wanted to use an input stream, but the functional method I discovered is probably smaller than the Java stream based approach I would have used otherwise.
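Roughly, the design described above could look like the sketch below. This is not the actual parser: the module name `csv_sketch`, the message shapes `{char, C}` and `{Pid, Rows}`, and all function names are assumptions made for illustration. It also cuts corners, since doubled quotes inside a quoted field and `\r\n` line endings are not handled.

```erlang
-module(csv_sketch).
-export([parse_string/1, parse_device/1, parse/1]).

%% A source is a fun that returns {Char, NextSource} or eof.
parse_string(String) -> parse(string_source(String)).
parse_device(Device) -> parse(device_source(Device)).

string_source([]) -> fun() -> eof end;
string_source([C | Rest]) -> fun() -> {C, string_source(Rest)} end.

%% Assumes the device was opened in list (non-binary) mode.
device_source(Device) ->
    fun() ->
        case io:get_chars(Device, "", 1) of
            eof -> eof;
            [C] -> {C, device_source(Device)}
        end
    end.

%% Spawn the state machine, feed it chars, then wait for the rows.
parse(Source) ->
    Parent = self(),
    Pid = spawn(fun() -> field(Parent, [], [], []) end),
    feed(Source, Pid),
    receive
        {Pid, Rows} -> Rows
    end.

feed(Source, Pid) ->
    case Source() of
        eof -> Pid ! eof;
        {Char, Next} -> Pid ! {char, Char}, feed(Next, Pid)
    end.

%% State: inside an unquoted field.
%% Field, Row and Rows are all accumulated in reverse order.
field(Parent, Field, Row, Rows) ->
    receive
        {char, $,}  -> field(Parent, [], [lists:reverse(Field) | Row], Rows);
        {char, $\n} -> field(Parent, [], [], [end_row(Field, Row) | Rows]);
        {char, $"}  -> quoted(Parent, Field, Row, Rows);
        {char, C}   -> field(Parent, [C | Field], Row, Rows);
        eof         -> Parent ! {self(), finish(Field, Row, Rows)}
    end.

%% State: inside a quoted field, where commas and newlines are literal.
quoted(Parent, Field, Row, Rows) ->
    receive
        {char, $"} -> field(Parent, Field, Row, Rows);
        {char, C}  -> quoted(Parent, [C | Field], Row, Rows);
        eof        -> Parent ! {self(), finish(Field, Row, Rows)}
    end.

end_row(Field, Row) -> lists:reverse([lists:reverse(Field) | Row]).

%% A trailing newline leaves nothing pending, so no extra row is added.
finish([], [], Rows)     -> lists:reverse(Rows);
finish(Field, Row, Rows) -> lists:reverse([end_row(Field, Row) | Rows]).
```

Each state is a function that blocks in `receive`, so moving to another state is just a call to a different function. The parsing process never knows whether the chars come from a string or a file, because only the source fun differs.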
Other notable CSV parsers include ppolv's and an FSM OTP behaviour from Praveen Ray of Yellowfish. I'm really impressed by the OTP behaviour, and I can imagine it would improve reuse once I'm comfortable with Erlang and OTP.