CS 3950 Introduction to Computer Science Research

General Information

Professor: David Choffnes
Room: Richards Hall 275
Time: Friday, 9:50am-11:30am
Instructor Office Hours: TBD
Location: 613 ISEC
Teaching Assistants: Lucianna Kiffer
TA Email: kiffer.l@husky.neu.edu
TA Office Hours: TBD
Class Forum: On Piazza

Course Description

Provides students with an introduction to research in the fields of computer science, information science, data science, and cybersecurity. Explores how the scientific method is applied to these fields, covers the breadth of sub-areas of speciality that exist, and gives students practice on how to locate and read scientific literature in different sub-areas. Also provides students with an overview of graduate education in these fields.
The class will meet once per week for a 100-minute session.

Goals and Format

By the end of this course, I expect you to:

As this course is closest to a seminar course, the structure will consist of three components:
Lectures on Basics of Research: The first few weeks will consist of lectures and discussions on the basics of computer science, research, and graduate studies. There will be weekly assignments consisting of homework and background reading.
Reading and Discussing Papers: The middle few weeks of the course will consist of reading and discussing papers from different areas of computer science. The focus will be on different styles of research, and how the results are presented.
Paper Presentations: The final few weeks of the class will consist of student presentations of research papers in groups. Each group will be expected to make a 10-minute presentation on a paper of their choice (subject to constraints discussed in class), followed by leading a 10–15 minute discussion of the paper.

Prerequisites

The official prerequisite for this course is CS 2500, or permission of the instructor. You will only need a basic knowledge of programming to take this course. This course will be largely discussion-based, and you will be expected to actively participate in class.

Class Forum

The class forum is on Piazza. Why Piazza? Because they have a nice web interface, as well as iPhone and Android apps. Piazza is the best place to ask questions about projects, programming, debugging issues, exams, etc. To keep things organized, please tag all posts with the appropriate hashtags, e.g. #meeting1, #homework1, etc. I will also use Piazza to broadcast announcements to the class. Bottom line: unless you have a private problem, post to Piazza before writing me/the TA an email.

Schedule, Topics / Lecture Slides, and Assigned Readings

Meeting Date | Topic / Slides | Readings | Comments
Jan. 11 | Introduction / What is CS Research? What do we mean by "science" and where is the Science in Computer Science? | [1], [2] | Hw. 1 Out
Jan. 18 | Overview of CS research areas | | Hw. 1 In, Hw. 2 Out
Jan. 25 | How to read (and write) a (good) research paper | [3], [4], [5, stop at "getting started"], [6] | Hw. 2 In, Hw. 3 Out
Feb. 1 | Systems, Networking, Security, Privacy | One of: [7], [8], [9], [10]. See poll on Piazza. | Hw. 3 In, Hw. 4 Out
Feb. 8 | Artificial Intelligence/Machine Learning/Data Science/Natural Language Processing | Background: [11]; [12] (Required), [13], [14]; Pick partners for final presentation | Hw. 4 In, Hw. 5 Out
Feb. 15 | Formal Methods, Programming Languages, Software Engineering | One of: [15], [16], [17], [18]. See poll on Piazza. | Hw. 5 In, Hw. 6 Out
Feb. 22 | Robotics, Human/Computer Interaction, Personal Health Informatics | Vote for one of [19], [20], [21] | Hw. 6 In, Hw. 7 Out
Mar. 1 | Theory and Algorithms | Required (short) reading: [22]; Voting on: [23], [24], [25], [26] | Hw. 7 In, Hw. 8 Out
Mar. 8 | Spring Break | |
Mar. 15 | Ethics in research | [27], [28], [29] (Optional supplementary reading) | Hw. 8 In, Hw. 9 Out
Mar. 22 | Presentations | [30], [31], [32], [33] |
Mar. 29 | Research in practice: Panel discussion with Ph.D. students | | Hw. 9 In, Hw. 10 Out
Apr. 5 | Presentations | [34], [35], [36], [37], [38] |
Apr. 12 | Presentations, Wrap up | [39], [40], [41], [42] | Hw. 10 due Monday 4/15 (on Blackboard)

Readings

  1. Peter J. Denning, Is Computer Science Science? In Communications of the ACM, April 2005.
  2. Dean Keith Simonton, After Einstein: Scientific genius is extinct, Nature 493, 602 (31 January 2013)
  3. Philip W.L. Fong. 2009. Reading a computer science research paper. SIGCSE Bull. 41, 2 (June 2009), 138-140.
  4. S. Keshav. 2007. How to read a paper. SIGCOMM Comput. Commun. Rev. 37, 3 (July 2007), 83-84.
  5. M. Ernst. How to write a technical paper, Last updated November 2018.
  6. David A. Patterson, Garth Gibson, and Randy H. Katz. 1988. A case for redundant arrays of inexpensive disks (RAID). SIGMOD Rec. 17, 3 (June 1988), 109-116.
  7. Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications, In Proceedings of ACM SIGCOMM, August 2001.
  8. Alma Whitten and J. D. Tygar. 1999. Why Johnny can't encrypt: a usability evaluation of PGP 5.0. In Proceedings of USENIX Security.
  9. Zakir Durumeric, Frank Li, James Kasten, Johanna Amann, Jethro Beekman, Mathias Payer, Nicolas Weaver, David Adrian, Vern Paxson, Michael Bailey, and J. Alex Halderman. 2014. The Matter of Heartbleed. In Proceedings of the Internet Measurement Conference (IMC), 2014.
  10. V. Paxson, End-to-End Routing Behavior in the Internet. In Proceedings of ACM SIGCOMM, August 1996.
  11. Breiman, Leo. Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author). Statist. Sci. 16 (2001), no. 3, 199-231.
  12. Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M. and Dieleman, S., 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), p.484.
  13. Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. Sixth USENIX Symposium on Operating System Design and Implementation, San Francisco, CA, December 2004.
  14. Patrick Pantel and Dekang Lin. 2002. Discovering word senses from text. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '02). ACM, New York, NY, USA, 613-619.
  15. Emery D. Berger and Benjamin G. Zorn. 2006. DieHard: probabilistic memory safety for unsafe languages. In Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '06).
  16. Pei Wang, Qinkun Bao, Li Wang, Shuai Wang, Zhaofeng Chen, Tao Wei, and Dinghao Wu. 2018. Software protection on the go: a large-scale empirical study on mobile app obfuscation. In Proceedings of the International Conference on Software Engineering (ICSE '18).
  17. Xavier Leroy. Formal certification of a compiler back-end or: programming a compiler with a proof assistant. In Symposium on Principles of Programming Languages (POPL '06), 2006.
  18. Michele Tufano, Fabio Palomba, Gabriele Bavota, Rocco Oliveto, Massimiliano Di Penta, Andrea De Lucia, and Denys Poshyvanyk. When and why your code starts to smell bad. In Proceedings of the International Conference on Software Engineering (ICSE) 2015.
  19. L. Pinto and A. Gupta, Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours, 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, 2016.
  20. Patrick Gage Kelley, Lorrie Faith Cranor, and Norman Sadeh. 2013. Privacy as part of the app decision-making process. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '13).
  21. Janne van Kollenburg, Sander Bogers, Heleen Rutjes, Eva Deckers, Joep Frens, and Caroline Hummels. 2018. Exploring the Value of Parent Tracked Baby Data in Interactions with Healthcare Professionals: A Data-Enabled Design Exploration. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI '18).
  22. Tim Roughgarden, Reading in Algorithms, Paper-Reading Survival Kit.
  23. Anna R. Karlin, Shayan Oveis Gharan, and Robbie Weber. 2018. A simply exponential upper bound on the maximum number of stable matchings. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC 2018).
  24. Jonathan Katz and Luca Trevisan. 2000. On the efficiency of local decoding procedures for error-correcting codes. In Proceedings of the thirty-second annual ACM symposium on Theory of computing (STOC '00).
  25. Ittay Eyal, Emin Gun Sirer. Majority is not Enough: Bitcoin Mining is Vulnerable. arXiv 2013.
  26. Manindra Agrawal, Neeraj Kayal, Nitin Saxena. PRIMES is in P, Annals of Mathematics, Volume 160, 2004.
  27. The Menlo Report: Ethical Principles Guiding Information and Communication Technology Research, August 2012.
  28. Mark Allman and Vern Paxson. 2007. Issues and etiquette concerning use of shared measurement data. In Proceedings of the 7th ACM SIGCOMM conference on Internet measurement (IMC '07). ACM, New York, NY, USA, 135-140.
  29. Sam Burnett and Nick Feamster. Encore: Lightweight Measurement of Web Censorship with Cross-Origin Requests. In Proceedings of SIGCOMM 2015, August 2015.
  30. Speicher, T., Ali, M., Venkatadri, G., Ribeiro, F.N., Arvanitakis, G., Benevenuto, F., Gummadi, K.P., Loiseau, P. & Mislove, A.. (2018). Potential for Discrimination in Online Targeted Advertising. Proceedings of the 1st Conference on Fairness, Accountability and Transparency, in PMLR 81:5-19.
  31. Ruiyu Yang, Yuxiang Jiang, Scott Mathews, Elizabeth A. Housworth, Matthew W. Hahn, Predrag Radivojac, A new class of metrics for learning on real-valued and structured data, arXiv.
  32. P. Luo, K. Athanasiou, Y. Fei and T. Wahl, Algebraic Fault Analysis of SHA-3 Under Relaxed Fault Models, in IEEE Transactions on Information Forensics and Security, vol. 13, no. 7, pp. 1752-1761, July 2018.
  33. Le Chen, Alan Mislove, and Christo Wilson. 2015. Peeking Beneath the Hood of Uber. In Proceedings of the 2015 Internet Measurement Conference (IMC '15).
  34. Ryan Culpepper and Matthias Felleisen. 2010. Fortifying macros. SIGPLAN Not. 45, 9 (September 2010), 235-246.
  35. Ruiyang Xu and Karl Lieberherr. Learning Self-Game-Play Agents for Combinatorial Optimization Problems. To appear in AAMAS 2019.
  36. Xinyu Hua, Lu Wang. Neural Argument Generation Augmented with Externally Retrieved Evidence.
  37. Jan Camenisch, Rafik Chaabouni, abhi shelat. Efficient Protocols for Set Membership and Range Proofs. In ASIACRYPT 2008.
  38. Ronald E. Robertson, Shan Jiang, Kenneth Joseph, Lisa Friedland, David Lazer, and Christo Wilson. 2018. Auditing Partisan Audience Bias within Google Search. Proc. ACM Hum.-Comput. Interact. 2, CSCW, Article 148 (November 2018).
  39. Herman Saksono, Carmen Castaneda-Sceppa, Jessica Hoffman, Magy Seif El-Nasr, Vivien Morris, and Andrea G. Parker. 2018. Family Health Promotion in Low-SES Neighborhoods: A Two-Month Study of Wearable Activity Tracking. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI '18).
  40. Jingjing Liu, Carla E. Brodley, Brian C. Healy, and Tanuja Chitnis. 2013. Removing Confounding Factors via Constraint Based Clustering: An Application to Finding Homogeneous Groups of Multiple Sclerosis Patients. In Proceedings of the 2013 IEEE International Conference on Healthcare Informatics (ICHI '13).
  41. Erik Andersen, Yun-En Liu, Richard Snider, Roy Szeto, Seth Cooper, and Zoran Popović. 2011. On the harmfulness of secondary game objectives. In Proceedings of the 6th International Conference on Foundations of Digital Games (FDG '11).
  42. Dilip Arumugam, Siddharth Karamcheti, Nakul Gopalan, Lawson L.S. Wong, and Stefanie Tellex (2017). Accurately and Efficiently Interpreting Human-Robot Instructions of Varying Granularities. In Robotics: Science and Systems.

Textbook

There is no textbook for this course.

Paper readings and discussions

Leading the discussion involves two tasks. First, you will make an approximately 10-minute presentation that describes the motivation, goals, and results of the paper. Second, after this presentation and for the remainder of the allocated time slot, you will lead the discussion on the paper. It is your job to ensure a lively atmosphere for discussion, while being careful to stay on the topic of the paper. I will let students sign up to lead the discussion for the papers they choose.

If you are an audience member, you are expected to have read the paper and to participate in the discussion. This is a seminar course, meaning that the point of the course is to have a discussion. Not participating in this part of the course is not an option. The list of the papers will be available on the course website.

Homeworks

This course will have weekly homework assignments reviewing readings and concepts that are discussed in each class. Homework assignments are to be done by each student individually. The homework assignments will be graded and handed back to you within a week.
Homework assignments are due at the beginning of lecture on the specified date. Slip days may be used on the homeworks; absent any use of slip days, homework will be marked down 20 points per day that it is late, up to 2 days.

Final presentations

Each student will form a team of two to identify a research area and recent paper authored by a current faculty member at Northeastern. Groups must discuss their paper choice with the instructor or TA to get approval to proceed. The students must read the selected research paper to understand the topic and ask questions pertaining to the research questions, approach, findings, and future work. They must then meet with the faculty member (or postdoc/Ph.D. student if the faculty member is unavailable), and discuss these questions in addition to discussing opportunities to contribute to a research project. At the end of the course, each team will make a brief (5-10 minute) presentation about the research paper and their experience meeting with the faculty member (or alternate).

Exams

There are no exams in this course.

Participation

You are expected to attend each class meeting, as attendance and active participation during meetings make up a large fraction of your grade. If you must miss class (e.g., you are ill or have some other obligation), please contact the instructor to discuss how to make up the time missed in class.

Grading

The breakdown of the grades in this course is:

Each homework will include a breakdown and description of how it will be graded.

Any requests for grade changes or regrading must be made within 7 days of when the work was returned. To ask for a regrade, attach to your work a page that specifies (a) the problem or problems you want to be regraded, and (b) for each of these problems, why you think the problem was misgraded.

To calculate final grades, I simply sum up the points obtained by each student (the points will sum up to some number x out of 100) and then use the following scale to determine the letter grade: [0-60] F, [60-62] D-, [63-66] D, [67-69] D+, [70-72] C-, [73-76] C, [77-79] C+, [80-82] B-, [83-86] B, [87-89] B+, [90-92] A-, [93-100] A. I do not curve the grades in any way. All fractions will be rounded up.
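As a rough illustration of the scale above, the following Python sketch rounds a point total up to the next whole number and looks up the corresponding letter. The names `letter_grade` and `GRADE_BANDS` are hypothetical and not part of any course tooling. The sketch also assumes that a total of exactly 60 is a D-, since the listed F and D- ranges both include 60; the instructor's actual handling of that boundary may differ.

```python
import math

# Lower bound of each band, copied from the scale in the syllabus.
# Assumption: exactly 60 counts as D- (the F and D- ranges both list 60).
GRADE_BANDS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(points: float) -> str:
    """Map a point total out of 100 to a letter grade, with no curve."""
    if not 0 <= points <= 100:
        raise ValueError("points must be between 0 and 100")
    # "All fractions will be rounded up."
    rounded = math.ceil(points)
    # Bands are sorted from highest to lowest, so the first match wins.
    for lower_bound, letter in GRADE_BANDS:
        if rounded >= lower_bound:
            return letter
    return "F"
```

For example, under these assumptions `letter_grade(89.2)` rounds up to 90 and returns `"A-"`, while `letter_grade(59)` returns `"F"`.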

Cheating Policy

It's ok to ask your peers about the concepts, algorithms, or approaches needed to do the assignments. We encourage you to do so; both giving and taking advice will help you to learn. However, what you turn in must be your own, or for projects, your group's own work. Looking at or copying code or homework solutions from other people or the Web is strictly prohibited. In particular, looking at other solutions (e.g., from other groups or prior CS 3950 students) is a direct violation. Projects must be entirely the work of the students turning them in, i.e. you and your group members. If you have any questions about using a particular resource, ask the course staff or post a question to the class forum.

All students are subject to Northeastern University's Academic Integrity Policy. Per CCIS policy, all cases of suspected plagiarism or other academic dishonesty must be referred to the Office of Student Conduct and Conflict Resolution (OSCCR). This may result in deferred suspension, suspension, or expulsion from the university.

Accommodations for Students with Disabilities

If you have a disability-related need for reasonable academic accommodations in this course and have not yet met with a Disability Specialist, please visit www.northeastern.edu/drc and follow the outlined procedure to request services. If the Disability Resource Center has formally approved you for an academic accommodation in this class, please present the instructor with your "Professor Notification Letter" at your earliest convenience, so that we can address your specific needs as early as possible.

Title IX

Title IX makes it clear that violence and harassment based on sex and gender are Civil Rights offenses subject to the same kinds of accountability and the same kinds of support applied to offenses against other protected categories such as race, national origin, etc. If you or someone you know has been harassed or assaulted, you can find the appropriate resources here.