Abstract
We worked with the Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set from the University of California, Irvine. We first performed a preliminary analysis of the data and found that the static activities show relatively flat acceleration and gyroscope signals, while the movement activities show significantly more variation. We then implemented and compared three algorithms: the K-nearest neighbors algorithm, the multilayer perceptron (also known as a fully connected neural network), and a random forest classifier, evaluating each on held-out test data. We found that the K-nearest neighbors algorithm gave us the best accuracy with a single neighbor.
Introduction
Human Activity Recognition has become widely used and valued. But what is human activity recognition? The article entitled “Human Activity Recognition in Artificial Intelligence Framework: A Narrative Review” defines Human Activity Recognition (HAR) as “the art of identifying and naming activities using Artificial Intelligence (AI) from the gathered activity raw data by utilizing various sources (so-called devices)” (Gupta, N., Gupta, S.K., Pathak, R.K., et al.). In other words, HAR is a machine learning task that determines what humans are doing at any given point in time. This includes walking, running, sitting, standing, walking upstairs, biking, and so on. In fact, HAR is used constantly: iPhones tell people how much they walk in a day, Apple Watches track the amount of physical activity people get, and HAR has found its way into healthcare, surveillance, and remote care for the elderly.
Literature Review
Beyond the applications of HAR, a major ongoing question is how best to implement it. Since 2006, researchers and computer scientists have tried different algorithms to determine which works best. In the article titled “High Accuracy Human Activity Recognition Using Machine Learning and Wearable Devices’ Raw Signals,” the authors trace the history of HAR research and the algorithms used. First, in 2006, Pirttikangas et al. “tested a model that used several multilayer perceptrons and k-nearest neighbors algorithms to recognize 17 activities to achieve an overall accuracy of 90.61%” (Papaleonidas, Psathas, and Iliadis). In 2011, Casale et al. “used a wearable device and applied a random forest classification algorithm to model five distinct activities (walking, climbing stairs, talking to a person, standing, and working on the computer)”, which achieved a 90% accuracy (Papaleonidas, Psathas, and Iliadis). In 2018, Brophy et al. “proposed a hybrid convolutional neural network and an SVM model with an accuracy of 92.3% for four activities (walking and running on a treadmill, low and high resistance bike exercise)” (Papaleonidas, Psathas, and Iliadis).
Values Statement
Our algorithm helps identify the activity someone is performing. It could be used in a smartwatch or phone to track health: How often do you stand up during the day? How much do you walk? It could provide useful information for building a better lifestyle. For example, within an app, the algorithm could track how much someone walks during the day. Assuming that person checks the app, they may notice that they don’t walk around as much as they should, which could influence them to exercise more. That would create a better lifestyle for that person and help them in the long run.
There could be some harm done if the device using the algorithm is not accurate. For example, an inaccurate algorithm could report that someone is walking when they are actually sitting down, leading them to believe they are getting more exercise than they actually are, to exercise less than they should, and potentially to health problems in the future. Outside of health, another potential harm arises if the algorithm is fed biased data. For example, if the device is used to determine whether someone is walking or running, it could be biased against people with physical disabilities.
I find Human Activity Recognition to be an extremely interesting topic and direction for technology to evolve in. I believe that it is important to understand how our bodies move and how we can use that information to improve our lives.
For this reason, I believe that implementing this technology will help people achieve a better lifestyle, which makes the world a more joyful place.
Materials and Methods
We used the Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set from the University of California, Irvine. The data set consists of recordings from 30 volunteers between 19 and 48 years of age who were asked to perform 6 basic activities: sitting, walking, walking upstairs, walking downstairs, lying down, and standing. According to the authors of this data set, “all the participants were wearing a smartphone (Samsung Galaxy S II) on the waist during the experiment execution.” They “captured 3-axial linear acceleration and 3-axial angular velocity at a constant rate of 50Hz using the embedded accelerometer and gyroscope of the device.” Finally, they preprocessed the data set by applying noise filters, including a Butterworth low-pass filter.
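To make the preprocessing concrete, here is a minimal sketch of a Butterworth low-pass filter in Python using scipy. It is an illustration rather than the dataset authors’ exact pipeline; the 20 Hz cutoff, the filter order, and the placeholder signal are our assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass(signal, cutoff_hz, fs=50.0, order=3):
    """Apply a zero-phase Butterworth low-pass filter to a 1-D signal."""
    nyquist = fs / 2.0
    b, a = butter(order, cutoff_hz / nyquist, btype="low")
    return filtfilt(b, a, signal)

# Hypothetical x-axis acceleration trace sampled at 50 Hz, standing in
# for a real sensor reading from the data set.
acc_x = np.random.randn(500)
acc_x_smoothed = lowpass(acc_x, cutoff_hz=20.0)
```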
Our algorithm is relatively objective compared to algorithms created by large corporations. Despite that, it may exclude people with disabilities. Looking at the data set, it appears that the participants are all people without physical disabilities. If we were to deploy our algorithm in the real world, it would not account for people with a physical disability, creating bias and unfairness. This could disproportionately hurt people with disabilities while helping able-bodied people.
Our features were the total acceleration, body acceleration, and body gyroscope readings on the three axes. Our targets were the 6 activities, labeled as numbers from 1 to 6. We used 70% of our data as the training set and the remaining 30% as the test set, applied the same train-test split to all three algorithms, and compared their accuracies on the held-out test set.
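As a rough sketch of this setup, assuming the raw signals have already been assembled into a feature matrix `X` and a label vector `y` (hypothetical names here, filled with random placeholder data), the split can be done with scikit-learn; the `random_state` and stratification are assumptions on our part, not details stated above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the real feature matrix and labels:
# one row per sample, nine columns (total acceleration, body acceleration,
# and body gyroscope on the x, y, and z axes), labels 1-6 for the activities.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 9))
y = rng.integers(1, 7, size=1000)

# 70% training / 30% testing, matching the split described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
```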
First, for the K-nearest neighbors algorithm, we ran it 10 times with different numbers of neighbors to find the value of k with the best accuracy. Then, for the multilayer perceptron, we ran it 10 times and took the average accuracy, since the neural network’s initialization is random. Finally, for the random forest classifier, we likewise ran it 10 times and averaged the accuracy to account for randomness.
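Continuing the sketch above, the comparison could look roughly like this in scikit-learn; the hyperparameters (for example, `max_iter` for the perceptron) are assumptions, since the report does not specify them.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier

# K-nearest neighbors: sweep k from 1 to 10 and keep the best test accuracy.
knn_scores = {
    k: KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train).score(X_test, y_test)
    for k in range(1, 11)
}
best_k = max(knn_scores, key=knn_scores.get)

# Multilayer perceptron and random forest: average accuracy over 10 runs to
# smooth out randomness in weight initialization and tree construction.
mlp_scores = [
    MLPClassifier(max_iter=500).fit(X_train, y_train).score(X_test, y_test)
    for _ in range(10)
]
rf_scores = [
    RandomForestClassifier().fit(X_train, y_train).score(X_test, y_test)
    for _ in range(10)
]

print(f"best k = {best_k}, KNN accuracy = {knn_scores[best_k]:.3f}")
print(f"MLP mean accuracy = {np.mean(mlp_scores):.3f}")
print(f"RF mean accuracy  = {np.mean(rf_scores):.3f}")
```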
Results
| Algorithm | Accuracy |
|---|---|
| K-Nearest Neighbors | 89.8% |
| Multilayer Perceptron | 85% |
| Random Forest Classifier | 85% |
We found that the K-nearest neighbors algorithm gave us the best accuracy, 89.8% with a single neighbor, while the other two algorithms did slightly worse at roughly 85% each.
Concluding Discussion
Our goal for this project was to perform “an exploratory analysis along with visualizations of a variety of variables such as total acceleration, body acceleration, and body gyroscope in all the types of activities across time,” followed by “using that knowledge to use a few predictive models on the data to predict the activity type based on the value of the variables.” We also said that we would submit a well-documented and clean code base. Looking at what we have accomplished, I can confidently confirm that we achieved our goals: we performed an exploratory analysis and visualized the data, used a few predictive models on the data to predict the activity type with decent results, and submitted a well-documented and clean code base.
Although our results are slightly inferior to others who have studied similar problems, we had different data sets and different goals. We were also limited by time and resources. We could have improved our results by using more data and more advanced algorithms. We could have also improved our results by using more advanced techniques to clean our data.
Group Contributions Statement
Kaylynn started work on the project by writing the functions to import the data. Unfortunately, she ran into problems with that, so I helped her fix them. We then worked all together on the first part of the preliminary analysis. I did the second part of the preliminary analysis while Mead and Kaylynn implemented the perceptron. I worked on commenting and documenting the code while Kaylynn implemented the K-nearest neighbors algorithm and the random forest classifier. Mead made the algorithms loop 10 times to take the average accuracy. I then went back through the code and cleaned up most of it while Mead and Kaylynn worked on the presentation. We had somehow ended up reading the data 7 times, so I cleaned up a lot of that and also left comments explaining the code and edits to some of the text. Kaylynn and Mead went back through the document and made some edits as well. We did not work on the blog post together, so we each have a separate one.
Personal Reflection
I learned how to use functions to import and subset data, run some exploratory analysis, and implement some algorithms. I gained more experience using the pandas, numpy, matplotlib, and sklearn libraries. I reinforced my stance on open and strong communication when working in a group. I also familiarized myself with Jupyter Notebooks and the Markdown language.
My initial goals were to ensure that our group would meet often to discuss progress on the project and stay on the same page; that we would all communicate with each other, submit our milestones on time, and all contribute to the project; and that I would reinforce my knowledge of the pandas, sklearn, and matplotlib libraries, as well as any other necessary libraries. Finally, I planned to revise the project report in response to feedback. Looking back at all of these, I can confidently say that I essentially achieved all my goals save for the last one.
Since I am going to graduate school for computer science, and since I am interested in the field of human–computer interaction, I can see myself doing research or taking a class related to this topic.