WakeOfAshes Posts: 21,665 destroyer of motherfuckers
I thought it was well written. Honestly, I wish I were more interested in the subject matter so I could have a longer discussion about it. I could add to this thread and post my last paper, but I don't think there would be much interest because it is a discussion of mathematics.
WakeOfAshes Posts: 21,665 destroyer of motherfuckers
I'm not really sure I can even post it. It is in PDF format because it has a lot of charts and figures. I guess I will try, but no one will be interested. I'm not even all that interested in the assignment.
WakeOfAshes Posts: 21,665 destroyer of motherfuckers
Statistics and Covariance
The purpose of this project was to compare how closely one type of test score correlates with the other types of test scores. Figure 1 lists the raw class test scores, with one column each for Quizzes, Exams, and the Final. It is assumed that a perfect score in each case would be 100.
(EDIT - not showing figure of data)
To see how closely two different test types correlate, we need to measure how student performance compares between each pair of Quizzes, Exams, and Finals. To allow for differences in difficulty, each score needs to be adjusted so that each test type has a mean of zero. This is done by subtracting the average score for each test type (listed in Figure 2) from each student's score on that test type. The translated score matrix X is shown in Figure 3.
(EDIT - removed figures 2 and 3 because they are also boring)
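Since the figures are not shown, here is a minimal sketch of the mean-centering step in plain Python. The original results came from MATLAB code that was not posted, so this is only an illustration, and the `scores` table is made-up placeholder data, not the professor's data set.

```python
# Hypothetical scores: one row per student, columns = Quizzes, Exams, Final.
scores = [
    [90.0, 80.0, 85.0],
    [70.0, 60.0, 75.0],
    [80.0, 70.0, 90.0],
    [60.0, 90.0, 70.0],
]


def center_columns(rows):
    """Subtract each column's mean so that every column has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[value - mean for value, mean in zip(row, means)] for row in rows]
    return means, centered


means, X = center_columns(scores)
print("column means:", means)
for row in X:
    print(row)
```

Each column of `X` now sums to zero, which is the property the rest of the analysis relies on.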
The column vectors of the translated matrix X represent the deviations from the mean for each of the three test types. Each of the three columns of X sums to zero and therefore has a mean of zero. To compare two sets of scores, we compute the cosine of the angle between the two corresponding columns. Since this project correlates Quizzes to Exams, Quizzes to Finals, and Exams to Finals, three different cosines need to be calculated (see Figure 4 for these calculations and results). A cosine value near 1 indicates that the two sets of test scores have a high degree of positive correlation. As you can see from the cosine values in Figure 4, these scores are not well correlated.
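The cosine comparison can be sketched as follows. This is an illustrative Python version, not the MATLAB code behind Figure 4, and the three centered columns are made-up placeholder values chosen so that each sums to zero.

```python
import math

# Hypothetical mean-centered columns (each sums to zero).
x1 = [15.0, -5.0, 5.0, -15.0]   # Quizzes
x2 = [5.0, -15.0, -5.0, 15.0]   # Exams
x3 = [5.0, -5.0, 10.0, -10.0]   # Final


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine of the angle between u and v, i.e. their correlation
    coefficient when both vectors are already mean-centered."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


cos_quiz_exam = cosine(x1, x2)
cos_quiz_final = cosine(x1, x3)
cos_exam_final = cosine(x2, x3)
print(cos_quiz_exam, cos_quiz_final, cos_exam_final)
```

Values near 1 or -1 indicate strong positive or negative correlation; values near 0 indicate little linear relationship.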
If two sets of scores were perfectly correlated and the corresponding coordinates of X1 and X2, X1 and X3, or X2 and X3 were paired off, then each ordered pair would lie on the line y=alpha*x, where alpha is the slope. Since I found that the correlation was not good, these data points when plotted do not fall close to that line. See Figure 5 for the calculation of the slope and Figure 6 for a plot of the line with the data points.
(EDIT - removed figure 5 because it is also boring)
This choice of slope gives the optimal least squares fit to these data points, which are really not correlated very well. Figures 6-8 show the data points together with this best least squares fit line.
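Figure 5 is not shown, so the following is only a sketch under an assumption: that the slope is the standard least squares slope for a line through the origin, alpha = (x . y) / (x . x), which minimizes the sum of (y_i - alpha*x_i)^2. The centered columns are made-up placeholder values.

```python
# Hypothetical mean-centered columns (each sums to zero).
x1 = [15.0, -5.0, 5.0, -15.0]   # Quizzes
x2 = [5.0, -15.0, -5.0, 15.0]   # Exams
x3 = [5.0, -5.0, 10.0, -10.0]   # Final


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def ls_slope(x, y):
    """Slope alpha of the least squares line y = alpha*x through the origin."""
    return dot(x, y) / dot(x, x)


alpha_12 = ls_slope(x1, x2)   # Exams fitted against Quizzes
alpha_13 = ls_slope(x1, x3)   # Final fitted against Quizzes
alpha_23 = ls_slope(x2, x3)   # Final fitted against Exams

# The residual y - alpha*x is orthogonal to x at the least squares solution.
residual_12 = [y - alpha_12 * x for x, y in zip(x1, x2)]
print(alpha_12, alpha_13, alpha_23, dot(x1, residual_12))
```

Because the data are centered, no intercept term is needed; the fitted line passes through the origin.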
The correlation matrix can then be represented as C=U'*U, where the columns of U are the columns of X scaled to unit length, and the (i,j) entry of C is the correlation between the i-th and j-th sets of scores. It is important to note that the scores are not all positively correlated, since the correlation coefficients were not all positive. When a negative value appears in the correlation matrix, those two sets of scores are negatively correlated. Although no entries were exactly zero, an entry of zero would mean that the two sets of scores are uncorrelated.
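A rough Python sketch of building C=U'*U, assuming U is the centered matrix X with each column scaled to unit length (the figures defining U are not shown in this post). The matrix `X` holds made-up placeholder values.

```python
import math

# Hypothetical mean-centered matrix: rows = students,
# columns = Quizzes, Exams, Final. Each column sums to zero.
X = [
    [15.0, 5.0, 5.0],
    [-5.0, -15.0, -5.0],
    [5.0, -5.0, 10.0],
    [-15.0, 15.0, -10.0],
]


def unit_columns(rows):
    """Return the columns of the matrix, each scaled to Euclidean length 1."""
    columns = [list(col) for col in zip(*rows)]
    scaled = []
    for col in columns:
        length = math.sqrt(sum(v * v for v in col))
        scaled.append([v / length for v in col])
    return scaled


def gram(columns):
    """Matrix of all pairwise inner products of the given columns (U'*U)."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]


C = gram(unit_columns(X))
for row in C:
    print(row)
```

The diagonal of `C` is 1 (each set of scores correlates perfectly with itself), and the matrix is symmetric, so only the three off-diagonal correlations carry information.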
The covariance matrix is another statistically important matrix that can be computed from the grade matrix X. It is found by multiplying the transpose of the mean-centered matrix X by X and then dividing by the number of data points minus 1, that is, S=X'*X/(n-1). See Figure 12 for this equation and Figure 13 for the covariance matrix S for X (Figure 3). It is important to note here that the diagonal entries of S are the variances of the three sets of scores, and the off-diagonal entries are the covariances.
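As an illustration, here is a small Python sketch of the sample covariance matrix, assuming the usual definition S = X'*X/(n-1) for a mean-centered X (Figure 12 is not shown, so that formula is an assumption about what it contains). The matrix `X` holds made-up placeholder values.

```python
import statistics

# Hypothetical mean-centered matrix: rows = students,
# columns = Quizzes, Exams, Final. Each column sums to zero.
X = [
    [15.0, 5.0, 5.0],
    [-5.0, -15.0, -5.0],
    [5.0, -5.0, 10.0],
    [-15.0, 15.0, -10.0],
]


def covariance_matrix(rows):
    """Sample covariance S = X'X / (n - 1) for an already centered matrix."""
    n = len(rows)
    columns = [list(col) for col in zip(*rows)]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in columns]
        for ci in columns
    ]


S = covariance_matrix(X)
for row in S:
    print(row)

# Cross-check: each diagonal entry should match the sample variance
# of the corresponding column.
quiz_variance = statistics.variance([row[0] for row in X])
print(S[0][0], quiz_variance)
```

The cross-check against `statistics.variance` is there only to confirm that the diagonal entries are the per-column sample variances.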
Conclusions Correlation
Correlation and covariance matrices can be extremely useful for predicting trends or understanding how performance on different items relates. In this example one might want to conclude that performance on quizzes is indicative of performance on exams or finals. One might also examine homework and see whether good homework grades correlate with good exam grades. Unfortunately, with the data sample given for this project, the quizzes, exams, and finals do not have much correlation. A student's performance on quizzes doesn't translate to their performance on finals. One could attempt to find some meaning in this loosely correlated data; however, in this case I would simply state that the performance of a student on quizzes, exams, or finals is based on intangibles that are not represented in the data set. I would like to conclude by stating that even if we had found a high correlation in this data set, correlation does not imply causation.
I might have understood the conclusion. :) So, where'd you get your data? Did you poll a class for their overall grades? I find that stuff interesting, but I have no idea how to go about doing it. I got as far as Algebra 2.
WakeOfAshes Posts: 21,665 destroyer of motherfuckers
The data was just dummy data that the professor made up for the sake of the project. That is mostly why I was completely uninterested in it. I am more interested in projects that actually deal with real problems, and not just some fairytale exercise. What I didn't post was the several pages of MATLAB code that produced all the results the paper discusses. That was really the meat of the assignment: write some code, and then write a scholarly paper discussing the results.
WakeOfAshes Posts: 21,665 destroyer of motherfuckers
I was hoping the final project would be a pick-your-own-data assignment. My choice was going to be the New York State Lotto. I had some good ideas for data manipulation to help predict lotto numbers. It would have been double epic if I got an A on the project and won the lotto! :)
WakeOfAshes Posts: 21,665 destroyer of motherfuckers
Yeah... I'm going to. However, I'm sure I won't win the lotto. I'm sure this sort of thing has been done many times. I wish it was possible to purchase lotto tickets online. Then I could just write a script to run my program daily and buy the lotto ticket for me. That sure would make it easy for me. Half the problem with the lotto is remembering to buy one.
Comments
And some may be interested in math on here. Post it. That's what I created this thread for.
....but he canceled the last project :(
Eek. Picking my classes for next semester. :-SS