Project 1

Deadline: Saturday February 29, 2020 11:00pm
This is version 2, posted on February 16. The assignment is largely the same but contains additional fields: an optional email address and first & last names instead of a single name.

Task

Write programs that compare JSON and Google Protocol Buffers for serializing and deserializing a given set of records. Compare the two formats with respect to time, message size, and rate of serialization/deserialization.

Input Record Format:

<id>,<LastName>,<FirstName>[,<email>]:<Course1>,<Marks1>:<Course2>,<Marks2>:...:<CourseN>,<MarksN>

Example:

1. 346-78-4956,Niels,Pirrone,npirrone2p@twitter.com:ID251,51:LD118,41:IT466,47
2. 201-41-2324,Patel,Dhruv:cs210,39:ece124,83

Note that the first record contains an email address for the student Pirrone Niels along with grades for three classes. The second record contains no email address for the student Dhruv Patel and grades for two classes. The sequence numbers are not part of the record. See input.txt for examples.
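
As a rough illustration of the record format described above, the following Python sketch splits one line into its fields and rebuilds the line from them. The function names `parse_record` and `format_record` and the dictionary keys are hypothetical choices, not part of the assignment. Storing marks as integers is also an assumption; it would not preserve marks written with leading zeros.

```python
def parse_record(line):
    """Split one input line into a dict; the email field is optional."""
    head, *course_parts = line.strip().split(":")
    fields = head.split(",")
    record = {"id": fields[0], "lastname": fields[1], "firstname": fields[2]}
    if len(fields) > 3:
        # A fourth comma-separated field, when present, is the email address.
        record["email"] = fields[3]
    record["courses"] = []
    for part in course_parts:
        name, marks = part.split(",")
        # Assumption: marks are plain integers without leading zeros.
        record["courses"].append((name, int(marks)))
    return record


def format_record(record):
    """Rebuild the original text line from a parsed record."""
    head = [record["id"], record["lastname"], record["firstname"]]
    if "email" in record:
        head.append(record["email"])
    parts = [",".join(head)]
    parts += [f"{name},{marks}" for name, marks in record["courses"]]
    return ":".join(parts)


if __name__ == "__main__":
    line = "201-41-2324,Patel,Dhruv:cs210,39:ece124,83"
    print(parse_record(line))
    print(format_record(parse_record(line)) == line)
```

Checking that `format_record(parse_record(line)) == line` for every input line is a quick way to catch stray whitespace or separators before comparing whole output files.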

Files

1. JSON Format:

Convert the given records (record_input.txt) to JSON format. The following illustrates the expected structure:

[{
	"lastname": "John", "firstname": "Amy",
	"CourseMarks": [{
		"CourseScore": 23, "CourseName": "cs324" }, {
		"CourseScore": 23, "CourseName": "ece124" }],
	"id": 201401022, "email": "am@random.com"}, {
	"lastname": "Dhruv", "firstname": "Patel",
	"CourseMarks": [{
		"CourseScore": 39, "CourseName": "cs210" }, {
		"CourseScore": 83, "CourseName": "ece124" }], "id": 20141232
}]
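
A minimal Python sketch of producing and reading back this JSON structure is given here. It assumes records are already held as dictionaries with the hypothetical keys `id`, `lastname`, `firstname`, `courses`, and an optional `email`. The key order follows the illustration, relying on Python 3.7+ dictionaries preserving insertion order. Keeping `id` as a string is an assumption; the exact representation should be checked against the reference output.

```python
import json


def to_json_obj(record):
    """Build a dict whose key order follows the illustrated JSON layout."""
    obj = {
        "lastname": record["lastname"],
        "firstname": record["firstname"],
        "CourseMarks": [
            {"CourseScore": marks, "CourseName": name}
            for name, marks in record["courses"]
        ],
        "id": record["id"],  # assumption: id kept as a string
    }
    if "email" in record:
        obj["email"] = record["email"]
    return obj


def from_json_obj(obj):
    """Invert to_json_obj, recovering the in-memory record."""
    record = {
        "id": obj["id"],
        "lastname": obj["lastname"],
        "firstname": obj["firstname"],
    }
    if "email" in obj:
        record["email"] = obj["email"]
    record["courses"] = [
        (c["CourseName"], c["CourseScore"]) for c in obj["CourseMarks"]
    ]
    return record


records = [
    {"id": "346-78-4956", "lastname": "Niels", "firstname": "Pirrone",
     "email": "npirrone2p@twitter.com",
     "courses": [("ID251", 51), ("LD118", 41), ("IT466", 47)]},
    {"id": "201-41-2324", "lastname": "Patel", "firstname": "Dhruv",
     "courses": [("cs210", 39), ("ece124", 83)]},
]

text = json.dumps([to_json_obj(r) for r in records])
restored = [from_json_obj(o) for o in json.loads(text)]
```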

2. Google ProtoBuf Format:

Convert the records to the given Google protobuf format protocol_defn.proto.

Naming convention for files

Serialize all the records and write the result to the file result_protobuf.

Deserialize the records back to the original format (as in the input) and write them to the file output_protobuf.txt.

Submission Format

  • Put all your code in a folder Assignment1_{your-netid}

  • Write a txt/md file that summarizes what you learned from this assignment.

  • Write a script “run.sh” that accepts the following options/flags:

    • -c: to compile the code (not needed for Python implementations)
    • -s -j <INPUT_FILE>: to serialize the given input records to json format and write it to “result.json”
    • -s -p <INPUT_FILE>: to serialize the given input records to protobuf format and write it to “result_protobuf”
    • -d -j <JSON_FILE>: to deserialize the given json file and write plaintext records to “output_json.txt”
    • -d -p <PROTOBUF_FILE>: to deserialize the given protobuf file and write plaintext records to “output_protobuf.txt”
    • -t -j <INPUT_FILE>: to perform metric measurement (time/size/rate) on the given input file with json as intermediate format and print it
    • -t -p <INPUT_FILE>: to perform metric measurement (time/size/rate) on the given input file with protobuf as intermediate format and print it.
  • Submit an archive as “Assignment1_{your-netid}.zip”. (Look at Key Point 9)
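
For a Python implementation, one possible arrangement is for run.sh to forward its arguments to a Python driver that parses the flags listed above. The sketch below shows only that flag parsing with the standard `argparse` module. The driver layout and the `build_parser` name are hypothetical, not something the assignment prescribes.

```python
import argparse


def build_parser():
    """Parse the run.sh flags: one mode (-c/-s/-d/-t) and one format (-j/-p)."""
    parser = argparse.ArgumentParser(prog="run.sh")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-c", action="store_true", help="compile the code")
    mode.add_argument("-s", action="store_true", help="serialize")
    mode.add_argument("-d", action="store_true", help="deserialize")
    mode.add_argument("-t", action="store_true", help="measure time/size/rate")
    fmt = parser.add_mutually_exclusive_group()
    # Each format flag carries the path of the file to process.
    fmt.add_argument("-j", metavar="FILE", help="use JSON")
    fmt.add_argument("-p", metavar="FILE", help="use protobuf")
    return parser


# Equivalent to invoking: ./run.sh -s -j input.txt
args = build_parser().parse_args(["-s", "-j", "input.txt"])
print(args.s, args.j, args.p)
```

With no long option names, `argparse` stores each flag under its single letter, so the serialize-to-JSON case is `args.s` being true with `args.j` holding the input path.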

Key Points

  1. Languages allowed: Java/C++/Python3 (json.type, proto.type) (choosing Java may help in next assignment).

  2. Do not add any additional special characters or white spaces while deserializing or serializing. “output_json.txt” and “output_protobuf.txt” should match exactly with the input file provided.

  3. Time should be measured in milliseconds (ms).

  4. Along with total time, print the rate of serialization/deserialization (amount of data / time taken). E.g., 10 Kbits/second → 10Kbps. Note that there are 8 bits in a byte so you will need to multiply your byte count by 8. You may use powers of 2 for kilobytes (1024) and megabytes (1024 * 1024) values.

  5. The time taken for conversion need not include the time taken for file I/O operations.

  6. Take a look at ordered dictionaries to match the output format to the one given for comparison.

  7. Failure to abide by the format will result in zero points. The assignment will be graded in an automated fashion.

  8. This is an individual assignment.

  9. Put your files in a folder (Assignment1_{your-netid}) and then zip it as Assignment1_{your-netid}.zip.

  10. DO NOT ADD OUTPUT OR INPUT FILES IN YOUR SUBMISSION. This will be considered as an attempt to cheat the automatic checker.
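
To make Key Points 3 to 5 concrete, here is a small Python sketch of timing a conversion in milliseconds, excluding file I/O, and turning a byte count into a rate in Kbps using 1 Kbit = 1024 bits. The helper names `rate_kbps` and `timed` and the sample data are hypothetical. JSON stands in for either format.

```python
import json
import time


def rate_kbps(size_bytes, elapsed_ms):
    """Kilobits per second, with 8 bits per byte and 1 Kbit = 1024 bits."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed time must be positive")
    kilobits = size_bytes * 8 / 1024
    return kilobits / (elapsed_ms / 1000.0)


def timed(func, arg):
    """Run func(arg) and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = func(arg)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


# Sample in-memory data; reading the input file would happen before timing.
data = [{"id": "201-41-2324", "lastname": "Patel", "firstname": "Dhruv"}] * 1000

encoded, ser_ms = timed(json.dumps, data)
size_bytes = len(encoded.encode("utf-8"))
decoded, de_ms = timed(json.loads, encoded)

print(f"size: {size_bytes} bytes")
print(f"serialize: {ser_ms:.3f} ms, {rate_kbps(size_bytes, ser_ms):.1f} Kbps")
print(f"deserialize: {de_ms:.3f} ms, {rate_kbps(size_bytes, de_ms):.1f} Kbps")
```

As a sanity check on the arithmetic, 1280 bytes processed in 1000 ms is 10240 bits per second, which is 10 Kbps under the power-of-2 convention.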

Last modified August 30, 2023.