Interests: Mathematics, Computer Science, Tae-kwon-do, and Guitar

# New Blog Location

All future content, course notes, and projects will now be hosted on my new blogging website wwkong.github.io.

I will still leave the majority of the content on my WordPress blog intact for anyone who wants to use it for reference, but please note that my course notes have been moved to the new site.

# Gender Equality in STEM Programs

Recently I’ve been interested in investigating how gender equality – or equivalently, inequality – has evolved over time in Canada. Using the University of Waterloo’s public Cognos cubes, specifically those for undergraduate enrollment, I have found some pretty interesting results. Below, I will detail a brief summary of these findings.

To begin, let’s talk a bit more about our population of study and the data used in our analysis. The target population that I’m examining here is the set of all undergraduate university students, and the sample population is the set of all undergraduate students who have enrolled at the University of Waterloo. The sample used in the analysis is the sample population restricted to students who enrolled between 1996 and 2013, where a term is included only if it has at least 20 distinct programs with at least 1 student in each. We make no distinction between students enrolled in a co-operative program and those who are not.

All analysis and visualization is done in the free academic version of Revolution R, Version 7.

For each date, at the Program (e.g. Life Sciences) and Faculty (e.g. Science) level, a total is computed for each of the female and male genders, and the percentage of females is then calculated as |Females|/(|Males| + |Females|). An ordered bar chart is rendered for each date using the ggplot2 R package, with the Program on the x-axis and percentage female on the y-axis, and a GIF animation of these charts is produced to study the time evolution, as seen below.

Bar colors are dependent on the Faculty of the Program. Click the image above to properly view the animation.
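
As a rough illustration of the percentage calculation described above, here is a minimal Python sketch. It is not the R code used for the analysis: the records, the program names and the helper `percent_female` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical enrollment records for a single term:
# (faculty, program, gender, headcount). The numbers are invented.
records = [
    ("SCI", "Life Sciences", "F", 60),
    ("SCI", "Life Sciences", "M", 40),
    ("ENG", "Software Eng", "F", 15),
    ("ENG", "Software Eng", "M", 85),
    ("MATH", "Statistics", "F", 45),
    ("MATH", "Statistics", "M", 55),
]


def percent_female(records):
    """Return {(faculty, program): |F| / (|M| + |F|)} for one term."""
    totals = defaultdict(lambda: {"F": 0, "M": 0})
    for faculty, program, gender, count in records:
        totals[(faculty, program)][gender] += count
    result = {}
    for key, t in totals.items():
        denom = t["F"] + t["M"]
        if denom > 0:  # skip programs with no students
            result[key] = t["F"] / denom
    return result


shares = percent_female(records)

# Order programs from least to most female, as on the x-axis of the chart.
ordered = sorted(shares.items(), key=lambda kv: kv[1])
for (faculty, program), share in ordered:
    print(f"{program} ({faculty}): {share:.1%}")
```

Repeating this for every term gives one ordered set of bars per frame of the animation.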

The abbreviations for each Program can be clarified here. A quick scan over the image shows that there does not appear to be any noticeable change in the overall shape other than a -very- slight flattening of the center bars and slight increase in the slopes near the extreme ends during later years.

Using this data, I use the following method as a crude estimate for Faculty-wide, time-dependent gender bias, which I define as how gender-biased a Faculty is relative to past or future states or enrollments of the university. Suppose that for a fixed date we have $n$ programs and $P=\{P_{1,F_1}, P_{2,F_2}, \ldots, P_{n,F_n}\}$ is a set of ordered values of percentages of females in $n$ different programs, ordered from least to greatest percentage of females in the first index, and where the second index represents the Faculty under which the program falls. Let $P_{F}=\{P_{k,F_k} \in P: F_k = F\}$ and $n_{F} = | P_{F} |$. Then for each Faculty $F$, we denote the (female-dominated) gender bias as

$G(F)= \frac{P_{n_{F},F_{n_{F}}}+P_{n_{F}-1,F_{n_{F}-1}}+P_{n_{F}-2,F_{n_{F}-2}}}{3n}$

which we can think of as a three-term average [1] of the quantiles of the three most female-dominated programs in the Faculty. A value close to 100% (less biased) is generally preferred.
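
To make the measure concrete, here is a minimal Python sketch of one possible reading of $G(F)$. It assumes that the three summed terms are the positions (1 to $n$) of the Faculty's three most female-dominated programs in the university-wide ordering, which is what "quantile" suggests, rather than the raw percentages. The function name `gender_bias` and the sample data are hypothetical, and this is not the R code behind the charts.

```python
def gender_bias(shares, faculty, top=3):
    """Rank-based reading of G(F) for one term.

    shares: {(faculty, program): fraction female}.
    Assumption: each summed term is the position (1..n) of one of the
    faculty's `top` most female-dominated programs in the ordering of
    all n programs, so the result is an average quantile in (0, 1].
    """
    n = len(shares)
    # Order all programs from least to most female.
    ordered = sorted(shares, key=lambda key: shares[key])
    # Positions of this faculty's programs, in increasing order.
    ranks = [pos for pos, (fac, _) in enumerate(ordered, start=1)
             if fac == faculty]
    if len(ranks) < top:
        raise ValueError(f"{faculty} has fewer than {top} programs")
    return sum(ranks[-top:]) / (top * n)


# Invented shares for nine programs across three faculties.
shares = {
    ("ENG", "E1"): 0.10, ("ENG", "E2"): 0.15, ("ENG", "E3"): 0.20,
    ("MATH", "M1"): 0.30, ("MATH", "M2"): 0.35, ("MATH", "M3"): 0.40,
    ("SCI", "S1"): 0.55, ("SCI", "S2"): 0.60, ("SCI", "S3"): 0.65,
}

for fac in ("ENG", "MATH", "SCI"):
    print(fac, round(gender_bias(shares, fac), 3))
```

Under this reading, a Faculty whose top three programs sit at the very end of the ordering scores close to 1, while one whose programs all sit at the start scores close to 0.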

Taking only the STEM Faculties into consideration (SCI, ENG, MATH), we plot out this measure over time using the lattice R package below:

The blue circles indicate points in time, the red lines are LOESS curves and the green lines are smoothing splines. The science faculty seems to follow a rather sinusoidal trend, the engineering faculty follows a mostly linear trend, except for the sudden rise in the 2003-2005 date range, and the maths faculty is the most sporadic of the three. There is an apparent outlier around 2008 in the maths faculty, although this may be explained by the increased interest in the new FARM program and other finance-related programs in light of the latest U.S. recession.

A least squares regression with slope and intercept interaction factors is also fitted in R to estimate long-term trends and is shown below:

Here, Idx is just a normalized Date variable. From the results, we can see that the long-run growth in the MATH and SCI Faculties is not significantly different between the two, and we can expect a long-term growth of female gender bias of approximately 0.23% every term in these faculties in the near future, while for engineering, this is closer to 0.03%.
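
One way to write down a regression of this kind is sketched here. The exact specification is not given in the text, so the baseline Faculty $F_0$, the coefficient names and the error term are assumptions made purely for illustration:

$G_{F,t} = \beta_0 + \beta_1 \mathrm{Idx}_t + \sum_{F' \neq F_0} (\alpha_{F'} + \gamma_{F'} \mathrm{Idx}_t) \mathbf{1}[F = F'] + \varepsilon_{F,t}$

Under this form, $\alpha_{F'}$ shifts the intercept and $\gamma_{F'}$ shifts the slope of Faculty $F'$ relative to the baseline, so a $\gamma_{F'}$ that is not significantly different from zero corresponds to two Faculties sharing the same long-run growth rate.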

With this in mind, it looks like we won’t be seeing fair gender equality for at least 2 decades for the sciences and several times that amount for the mathematics and engineering faculties.

To replicate these results, as well as see the charts above in higher resolution and examine the source data, you can check out the relevant SkyDrive directory here.

If you have any comments or suggestions for future statistical projects, let me know in the comments section below.

[1] An average is used here to smooth out outliers, of which the data shows a few, particularly in the architecture program.

# Another End of Term Post

Hi everybody,

Just like my other end of term posts, I’ll start with the same old, same old. School has been rough, interview cycles have been hectic, and plans for graduation are looming on the horizon, so naturally blogging has been at the bottom of my priority list.

However, today I have been fairly productive in cleaning up the Course Notes section of this blog. Instead of maintaining the notes on my Waterloo Linux account, I’ve decided to store all relevant files on my SkyDrive account with encrypted links included in that section of the blog (the old files will still be on my Linux account, but will be outdated as of today). I’ve also spent the entirety of today updating the details of my Spring 2013 courses, which include the abstracts, indices, and references of each course, so as to make your reading and learning experience a bit easier.

This term I don’t plan to take down any PMATH 451 notes, even though I will be taking it with Prof. Forrest. The reason is that online courses generally come with enough reference material that I won’t need to. I may decide to write a midterm or final exam review sheet though.

Hopefully, this week I’ll have enough post-exam creative juices within me to write up a few quality articles before I’m off to my internship on the 26th.

# Spring 2013 Course Notes

Hi everybody,

As I move into the edge of third year as an undergraduate student, the time that I have to actively contribute to this blog lessens more and more. However, with my investment in a new tablet/laptop (that can last more than 1 hour) in the form of the latest Microsoft Surface Pro, I hope to make it up to everyone out there by offering not one, not two, but SIX sets of course notes this term.

Specifically I will be covering PMATH 450, PMATH 352, ACTSC 372, ACTSC 445, STAT 371 and STAT 330. I have already posted the most up to date versions of these notes in the Course Notes section of my blog and will continue to update them throughout the term, along with review notes for midterms and finals. Note that currently I am emphasizing the content of these notes rather than the aesthetics, so some areas such as the index and abstract are still under construction.

Hopefully I will have some time to present something of interest from my studies as the term goes on, but for now, all of you who are dying to know more about the details of the non-measurability of transforms of $\mathbb R \backslash \mathbb Q$ will have to make do with my notes.

Until next time,

Stochastic Seeker

# Fall 2012 Exam Notes

Final exam review sheets have been posted for STAT 333 and PMATH 351 in the Course Notes section of this blog. I will update this post when more notes become available.

Update 1: The review sheet for ACTSC 371 is now available.

Update 2: The review sheet for CS 338 is now available.

# Fall 2012 Quick Update

Unfortunately for this term, I have decided not to typeset any notes, including those for PMATH 351 and STAT 333. This is mainly because detailed instructor notes have been posted on UW Learn and because I have responsibilities with Jobmine, my six upper year courses, and my independent studies towards the PRM designation.

I will, however, be typesetting review packages for midterms and finals of some of the courses that I am taking this term, so look forward to that.

When I have more time and less work, I can hopefully get back to writing more blog entries.

# Some end of term work

Just about one week after my exams, I went to work on tidying up course notes and making sure that extras such as final exam study notes were polished enough to put on the blog. They’re now ready in the Course Notes section of this blog along with a bonus set of notes based on the multivariable calculus video series here.

The series is taught by Prof. Auroux, and I decided to focus only on the vector calculus component of the lectures, as the other components are well covered by my MATH 247 notes (also found on the same blog page). Specifically these notes cover the following:

In addition to these notes, I also completed one of my little side projects, which is a basic proof of the irrationality of Euler’s number, e. Check it out in the Resources section.

Over the next couple months or so, I will also be preparing to write the Putnam exam in December, so stay tuned for future interesting problems.

For undergraduates who are taking the summer off, I hope you all have a nice vacation and a good rest for the upcoming fall term.

For co-op students like me who are back to work next week, stay diligent and strong. Your efforts will be appreciated someday.