Sunday, December 22, 2019

A review of my CMSC 131 class offering



Last semester I taught, for the first time, the Introduction to Computer Organization and Machine Level Programming course for undergrads. Traditionally, this course mainly teaches students x86 (32-bit) assembly language programming in an Ubuntu environment using NASM. I decided to "upgrade" the course by steering it towards systems programming and introducing x86-64 assembly language. My main reason for the upgrade is to prepare the students for the Operating Systems class, which they will take later and which I also teach. To achieve this goal, I needed textbooks to guide me, which led me to use two books: CS:APP and ALPU.
CS:APP was written by professors from CMU and is widely used in many top universities. I used the slides provided with the book, but in my lectures I kept several terminals open to demonstrate the code actually running. ALPU is also a good book, with simple exercises at the end of each chapter.

The topics (based on the CS:APP2e contents) I was able to cover in the lecture are the following:
  • A tour of computer systems
  • Representing and manipulating information
  • Machine-level representation of programs
  • Memory hierarchy
  • Linking

In the lab, I was able to cover most of the chapters in ALPU, excluding DDD, Macros, Stack Buffer Overflow, I/O Buffering, Floating Point, and Parallel Processing.

I introduced GDB in the lab to help students debug their programs. I also used GitHub Classroom for distributing boilerplate code and code submission.

In my lectures, the emphasis is on how the GCC compiler translates C source code into assembly language and how the processor executes the machine code. I always have terminals open to fire up GCC and GDB during lectures.

For future improvement of the offering of this course, I recommend the following:
  • Use the same syntax in both the lecture and the lab. CS:APP2e uses AT&T syntax while ALPU uses Intel syntax.
  • CS:APP2e is still 32-bit with a little introduction to 64-bit. It may be possible to use the 3rd edition which I believe is purely 64-bit.
  • Most of the programming exercises do not accept input and produce output interactively. A GDB command/script file can be used to provide input and produce output by explicitly stating the variable names, memory addresses, or registers. Checking of programming exercises can be automated using GDB command scripts.
  • Introduce assembly language programming in Windows environments using Microsoft Native Build Tools.
  • Introduce ARM assembly language programming using an emulator or RPi.

I would like to thank Prof. Rizza DC. Mercado and Prof. Kendall Jaen for sharing their teaching materials. It was fun teaching this course and I learned some new topics in depth.

Saturday, December 21, 2019

How to determine research potential


As professors, we are often on the lookout for undergraduate students (or junior faculty) with research potential to join our research groups.  We want to encourage these students to pursue graduate studies in order to further advance the field by becoming researchers or professors in the future.

We base our evaluation initially on their grades from the courses they took. Our claim is that the higher the General Weighted Average (GWA) of a student, the higher the research potential.  

Research potential, as described in Costromina et al. (2014), is a multidimensional and multilayered system of individual psychological traits (motivational, cognitive, and behavioral characteristics) that act together to allow an individual to conduct research activity efficiently and fruitfully. These traits are described below:

Motivational

  • Intolerance for ambiguity - capacity to experience positive feelings in new, unstructured, and varied situations
  • Satisfaction in solving problems - capacity to feel gratification from the process of finding ways and means for coping with scientific tasks
  • Intellectual curiosity - the conscious desire to receive information about objects and to enjoy learning
  • Intolerance for novelty - reveals thirst for experimentation, innovations, etc.

Cognitive

  • Flexible thinking - can overcome conventional thinking
  • Critical thinking - capacity to reveal mistakes and inconsistencies, to correct errors, and to justify the validity of a hypothesis
  • Logical thinking - ability to use facts and laws to confirm the accuracy of conclusions promptly
  • Quick thinking - ability to understand the situation and to make decisions in a timely manner
  • Original thinking - capacity to propose new, unconventional ideas

Behavioral
  • Self-organization - structuring of a researcher's personal activity to reach objectives
  • Self-control - following research procedures and completing work tasks
  • Adaptability - reduces time necessary to accept the changing conditions of a research task
  • Assertiveness - maintaining stability while working in unstable conditions

Costromina et al. (2014) conducted a study to compare undergraduates, master's students, and professors along the above dimensions. In their conclusion, they reported data showing the high predictive validity of theoretical abilities in defining students' level of research potential.

At the undergraduate level, intolerance for novelty, self-control, adaptability, assertiveness and critical thinking are the characteristics that should have been developed, according to the study.

It seems, therefore, that using GWA as a first screen of a student's research potential is valid. In addition, however, a student should also be evaluated based on the traits described above.


Reference

Thursday, November 7, 2019

SRG Authorship Guiding Principles

This post aims to address authorship in research papers produced within our research group. I think that authorship guidelines should be discussed early in the research project or graduate study in order to avoid complications later. I've witnessed colleagues and other individuals who abuse authorship by giving 'honorary' authorships, excluding names from the list of authors, and changing authorship order. The work by Solomon, Programmers, Professors, and Parasites: Credit and Co-Authorship in Computer Science, provides a good discussion of this topic. It presents the following principles that I will adopt for our group.


1. Authorship credit should be distributed only to those researchers directly involved with the paper or project in question. Researchers with indirect or minimal involvement may be mentioned in an additional "acknowledgements" section if necessary. All contributors should appear on paper; "ghost writing" is an invalid way even for a busy researcher to produce publications.

- Should you include your adviser and your committee members?
- Should you include project leader, project staff, research assistants?
- Should you include your research group leader?
- Should you include ALL members of your research group?
- Should you include your special someone?

2. All authors should be paired with short descriptions of their contributions to the project. These descriptions need not be on the title page but should be apparent to anybody seeking further information about the research presented. This principle extends to the acknowledgements list. In general, any individuals or organizations mentioned by the paper should be identified to avoid "honorary" authorship and make explicit the division of work leading to the final results.

3. The list of authors should be divided by level of contribution. Within each division, authors should be ordered by the amount they contributed to the particular paper in question. Truly equal co-authorship relationships should be marked as such, with none of the authors identified as a "corresponding" author. The lack of a single corresponding author can be addressed by creating a simple email alias that contacts all the principal authors simultaneously. Those researchers who would be considered "inventors" should be marked as such for the purposes of verifying future patent applications.

4. Upon publication, authors should be required to sign that the work in the paper is at least partially their own and that no other authors should be given credit.

5. Any and all decisions involving authorship should involve the mutual consent of all authors, which should be established via individual contact.

6. Any discovered cases of authorship fraud should be dealt with in much the same way as data fabrication. Once they are caught, authors should be required to explain their incorrect practices in a published statement and rectify any disadvantages suffered by parties not receiving appropriate credit.


References:

[1] Solomon, Justin. (2009). Programmers, Professors, and Parasites: Credit and Co-Authorship in Computer Science. Science and engineering ethics. 15. 467-89. 10.1007/s11948-009-9119-4.

[2] Allison Gaffey. (2015). Determining and negotiating authorship. Retrieved October 6, 2019 from https://www.apa.org/science/about/psa/2015/06/determining-authorship.    



Wednesday, October 30, 2019

Using VBoxManage to run BioLinux, headless


1. Import the appliance

$ vboxmanage import bio-linux-8-latest.ova

2. Check if the VM was imported

$ vboxmanage list vms
$ vboxmanage showvminfo "Bio-Linux-8.0.7" | less

3. Modify the VM to use bridged network connection

$ vboxmanage modifyvm "Bio-Linux-8.0.7" --nic1 bridged --bridgeadapter1 eno1

4.  Start the VM in headless mode

$ vboxmanage startvm "Bio-Linux-8.0.7" --type headless

5. Check if the VM is running

$ vboxmanage list runningvms

6. Get the IP address assigned to the VM

$ vboxmanage guestproperty enumerate {`VBoxManage list runningvms | awk -F"{" '{print $2}'` | grep IP | awk -F"," '{print $2}' | awk '{print $2}'

7. Hard shutdown of the VM

$ vboxmanage controlvm "Bio-Linux-8.0.7" poweroff

8. Use SSH to connect to the VM

Sunday, October 13, 2019

DEC{}DE 2019: Gear UP Experience

We again attended this year's DEC{}DE event sponsored by Trend Micro. This is my third year attending the event (2018, 2017). The talks were really interesting, especially the keynote by Jay Yaneza. I also liked the talk given by Jon Oliver about the role of Machine Learning in Cybersecurity, where he emphasized that ML must be layered on top of existing security solutions. The hands-on session was on PowerShell for the Blue Team.


(Photo from Trend Micro)


Sunday, October 6, 2019

Video: Basic Malware Analysis Workflow

The setup is using a Whonix Gateway VM and a Windows XP VM running in VirtualBox. Our objective is to capture the network traffic generated by malware. The malware is run on the Windows XP VM configured to use Whonix as the gateway.


Thursday, September 5, 2019

Introduction to debugging C programs using GDB


Instead of just reading the code, you can use a debugger such as GDB to find errors in C programs. GDB is available in most Linux distributions.

Example code, prod.c :

#include <stdio.h>
#include <stdlib.h>

int mul(int x, int y){
   int prod;
   int i;

   prod=0;
   for (i=0;i<y;i++){
      prod=prod+x;
   }
   return prod;
}

int main(){
   int a=4;
   int b=3;

   printf("The product of %d and %d is %d\n",a,b,mul(a,b));
   
   return 0;

} 

The following are the typical activities when debugging C programs:

1.  Create the executable with debug information

$ gcc -g -o prod.exe prod.c

For assembly language programs:

$ nasm -g -F dwarf -felf64 prod.asm
$ ld -o prod.exe prod.o


2.  Load the program in GDB

$ gdb prod.exe

3. View the source code listing

(gdb) list
 
4. Set a breakpoint

(gdb) b * main

5. Execute until breakpoint

(gdb) r

6. Execute next line

(gdb) n

7. View current line being executed

(gdb) frame

8. Step into a function

(gdb) s

9. View local variables

(gdb) info locals

10. Print variables

(gdb) print a

11. Set new values for variables

(gdb) set variable a=5

12. Continue execution until next breakpoint

(gdb) c

13. Quit

(gdb) quit

Wednesday, July 31, 2019

Notes from "It's not how good you are, it's how good you want to be"

  • All creative people need something to rebel against; it's what gives their lives excitement.
  • There is no instant solution, the only way to learn is through experience and mistakes.
  • Aim beyond what you are capable of.
  • People who are conventionally clever get jobs on their qualifications (the past), not on their desire to succeed (the future).
  • Do not seek praise. Seek criticism.
  • If you accept responsibility, you are in a position to do something about it.
  • If you give away everything you have (ideas), you are left with nothing. This forces you to look, to be aware, to replenish.
  • Don't look for the next opportunity. The one you have in hand is the opportunity.
  • Accentuate the positive. Eliminate the negative.
  • Do not put your cleverness in front of the communication.
  • Don't promise what you can't deliver.
  • Give your client what he wants and he may well give you what you want.
  • Don't take no for an answer. Work out a new approach.
  • When it can't be done, do it. If you don't do it, it doesn't exist.
  • If you can't solve a problem, it's because you're playing by the rules.
  • The person who doesn't make mistakes is unlikely to make anything.
  • It is wrong to be right, because people who are right are rooted in the past, rigid-minded, dull and smug.
  • It is right to be wrong. Being wrong isn't in the future, or in the past. 
  • Don't be afraid of silly ideas.
  • Play your cards right.
  • It's not what you know, it's who you know.
  • Don't give a speech. Put on a show.
  • Getting fired can be a positive career move.
  • Rough layouts sell the idea better than polished ones.
  • If you get stuck, draw with a different pen.
  • Don't be afraid to work with the best.
  • Do not try to win awards.

Saturday, July 20, 2019

How to solve problems

Some notes from the book How to Think Like a Mathematician by Kevin Houston. I found this chapter interesting because I could relate it to the process I follow when solving competitive programming problems.

Definitions
  • Exercise - something that can be solved by a routine method
  • Problem - something that will require more thought; will require the application of routine methods learned in exercises
"The best way to learn how to solve problems is to solve problems."

Sample Problems
  1. How many zeroes are at the end of 100! (100 factorial)?
  2. Suppose that X and Y are two finite sets. Find a formula that relates |X|, |Y|, |X intersection Y| and |X union Y|.
  3. Show that the equation x^2 + y^2 = z^n has positive integer solutions for every n = 1,2 ,3, ...


Polya's four-step plan

Understanding the problem
  1. Understand all the words and symbols in the problem - Know the meaning of the symbols as well as the important definitions and theorems
  2. Guess - use your intuition - Make an educated guess
  3. What do you know about the hypothesis and conclusion? - Write down what you know about the hypothesis and conclusion
  4. Work backwards and forwards - You can start with the conclusion and think what it would imply.
  5. Work with initial and special cases - Some problems have an index (n in P3). Solve for the initial cases to get a 'feel'.
  6. Work with a concrete case - For abstract problems, look at a concrete case. Create instantiations of the variables. (P2)
  7. Draw a picture - Venn diagrams (P2)
  8. Think about a similar problem - Recall problems you've solved before that may be related to the problem you are solving.
  9. Find an equivalent problem - Reformulate a problem, say to show that two functions are equal, they can be represented as a new function with difference of zero.
  10. Solve an easier problem - In P1, solve 10! first to get a feel.
  11. Rewrite in symbols or words
Devising a plan
  1. Break the problem into pieces -
  2. Find the right level - there are many ways to approach a problem so select the right one
  3. Give things names - "Let X be ..."
  4. Systematically choose a method - A proving problem can be solved using different approaches.
Executing a plan
  1. Check each step - don't use intuition
Looking back
  1. Check the answer - Test.
  2. Find another solution
  3. Reflect - Think about what solved the problem.



Saturday, July 13, 2019

Internship at KAIST

This Midyear (June 24-August 16, 2019), I am a research intern at the Computer Architecture Laboratory of the School of Computing in KAIST under the supervision of Prof. Youngjin Kwon. KAIST ranks high in the areas of systems that I am really interested in, so this internship is a very good opportunity for me. My teammate for the internship project is Jongyul Kim, who is a PhD student in the lab. We are working on improving the performance of distributed storage systems using RDMA, programmable NICs, and NVM.


    Tuesday, May 28, 2019

    Sections I read in an SP paper

    A Special Problem (SP) paper is a form of documentation for capstone projects of students. It is in the format of a journal article. As a panel member, it is only during the SP presentation itself that I get a copy of the SP paper of the presenter. So given the limited amount of time to read the paper, I only read certain parts of it before the presentation starts.

    First, I read the title to identify the general area (e.g. web/mobile app, ML, algorithms) of the project. Second, I read the abstract to get an overview of the work. Third, I focus on the objectives to know the specifics of what the project should achieve. Lastly, I read the results and discussion to check whether the objectives were met.

    It is unfortunate that some advisers don't bother to read/edit the SP papers of their students before the presentation. Nonetheless, the four sections I mentioned above, I think, should be written with care.

    Saturday, May 4, 2019

    Programming Tips for Student Projects

    • Use git for version control. Follow this simple workflow model.
    • Create separate folders for frontend and backend components especially if using the MERN stack. You can also create a data folder inside the backend. An example application template.
    • Use config files to set values for database configuration such as dbhost, dbuser, dbpass, dbname
    • Create .sql files that contain initialization data and stored procedures.
    • Use relative URLs in your app.
    • Write an INSTALL text file that describes how to install your application. Indicate the dependencies (OS version, package names, version number). If possible, create an install.sh or setup.sh to automate the installation process.
    • Use a coding convention for naming variables, functions, methods, etc.
    • Do not store passwords in plaintext.
    • Learn Docker, Docker Compose, and Travis CI. Check my app template.
    • Write automated tests.
    • (more to follow)

    Thursday, April 4, 2019

    PCSC 2019 Experience

    PCSC 2019 was held at the National University in Manila. According to the organizers, approximately 200 participants registered. There were three of us from UPLB. Although I did not present a paper, I enjoyed the presentations. Most were in the fields of AI, ML, and algorithms. It was good to see familiar faces from the computing research community in the country. In the picture below is Dr. Roxas, the current president of the Computing Society of the Philippines, delivering the opening remarks.


     

    Sunday, March 24, 2019

    Video: Data Link Layer Lecture

    Thursday, March 21, 2019

    Learning Windows 7 Internals



    I've been using Linux (Ubuntu distro) for a long time and have gained a fairly deep understanding of its internals. I guess it is time for me to focus on Windows.


    Books
    • Windows Internals (Parts 1 and 2), 6th Ed. by Russinovich et al.
    Software

    Compiling Code
    • SetEnv.cmd /Debug /x86 /win7