Human Activity Recognition in Machine Learning

March 17, 2025 | 9 min read

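Human Activity Recognition (HAR) is a promising research direction in computer vision and human-computer interaction. It aims to develop systems and techniques that automatically recognize and classify human behavior from sensor data.

Human activity detection has become essential in pervasive computing, interpersonal communication, and human behavior analysis. The wide range of HAR applications promotes human safety and well-being. Wearable devices that track physical activity, heart rate, and sleep quality can be used to monitor personal health. In smart homes, HAR-based solutions save energy and improve personal comfort by detecting when people enter or leave a room and adjusting the lighting or temperature accordingly. Personal safety devices can automatically notify emergency services or selected contacts. And this is only the tip of the iceberg.

To understand this better, we will build a model that tries to recognize the activity a person is performing.

About the Dataset

The dataset groups human activities into 15 classes. It contains a little over 12,000 labeled images, including validation images. Each image belongs to exactly one activity class and is stored in a separate folder for that class.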

csv: Contains all the photos that will be used to train your model. This folder has 15 subfolders: 'calling', 'clapping', 'cycling', 'dancing', 'drinking', 'eating', 'fighting', 'hugging', 'laughing', 'listening_to_music', 'running', 'sitting', 'sleeping', 'texting', 'using_laptop'.
csv: Contains 5,400 human activity photos. For each of these photos you need to predict one of the same 15 class names: 'calling', 'clapping', 'cycling', 'dancing', 'drinking', 'eating', 'fighting', 'hugging', 'laughing', 'listening_to_music', 'running', 'sitting', 'sleeping', 'texting', 'using_laptop'.
csv: Gives the order in which the prediction for each image must be uploaded to the platform. Make sure the image file names in your predictions appear in the same order as in this file.
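Because the submission file fixes the order of the predictions, it can help to reorder them by file name before saving. The sketch below is a minimal illustration in plain Python. The function name `order_predictions` and the sample file names are hypothetical and not part of the dataset.

```python
def order_predictions(predictions, required_order):
    """Return (filename, label) pairs arranged in the required file order.

    predictions    -- dict mapping an image file name to its predicted class label
    required_order -- list of file names in the order the platform expects
    """
    missing = [name for name in required_order if name not in predictions]
    if missing:
        # fail loudly rather than silently submitting an incomplete file
        raise KeyError(f'no prediction for: {missing}')
    return [(name, predictions[name]) for name in required_order]


# hypothetical file names and labels, for illustration only
preds = {'Image_2.jpg': 'running', 'Image_1.jpg': 'sitting', 'Image_3.jpg': 'eating'}
order = ['Image_1.jpg', 'Image_2.jpg', 'Image_3.jpg']
ordered = order_predictions(preds, order)
print(ordered)
```

Raising an error on a missing file name is a design choice for this sketch; a real pipeline might prefer to log the gap and continue.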

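Problem statement: Human Activity Recognition (HAR) aims to understand human behavior and assign a class to each action. It has a wide range of applications and is therefore drawing increasing attention in computer vision. Human activities can be represented with many data modalities, including RGB, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, radar, and WiFi signals. These modalities encode useful but distinct information from different sources and offer different advantages depending on the application. Consequently, a number of existing publications have studied HAR methods built on different modalities. Our task is to build a CNN-based image classification model that determines which type of activity a person is performing.

Code

Importing Libraries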

import numpy as np
import pandas as pd
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import time
import matplotlib.pyplot as plt
import cv2
import seaborn as sns
sns.set_style('darkgrid')
import shutil
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense, Activation,Dropout,Conv2D, MaxPooling2D,BatchNormalization
from tensorflow.keras.optimizers import Adam, Adamax
from tensorflow.keras.metrics import categorical_crossentropy
from tensorflow.keras import regularizers
from tensorflow.keras.models import Model

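Loading the Dataset

Once the images are loaded into the system, a structured dataframe is created. The dataframe holds per-image information such as the file location and the class label. Class labels are essential for classifying the images and lay the groundwork for the analysis and model training that follow. This organized format simplifies managing and processing the image data in later steps.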

train_csv_path=r'../input/human-action-recognition-har-dataset/Human Action Recognition/Training_set.csv'
test_csv_path=r'../input/human-action-recognition-har-dataset/Human Action Recognition/Testing_set.csv'
train_img_path=r'../input/human-action-recognition-har-dataset/Human Action Recognition/train'
test_img_path=r'../input/human-action-recognition-har-dataset/Human Action Recognition/test'
df=pd.read_csv(train_csv_path)
# modify df to have column names filepaths and labels
df.columns=['filepaths', 'labels']
# modify df so entries in filepaths column are the full path to the image file
df['filepaths']=df['filepaths'].apply(lambda x: os.path.join(train_img_path, x))
# split df into a train_df , a valid_df and a test_df
train_df, dummy_df=train_test_split(df, train_size=.9, shuffle=True, random_state=123, stratify=df['labels'])
valid_df, test_df= train_test_split(dummy_df, train_size=.5, shuffle=True, random_state=123, stratify=dummy_df['labels'])     
print('train_df length: ', len(train_df), '  test_df length: ', len(test_df), '  valid_df length: ', len(valid_df))
# get the number of classes and the image count for each class in train_df
classes=sorted(list(train_df['labels'].unique()))
class_count = len(classes)
print('The number of classes in the dataset is: ', class_count)
groups=train_df.groupby('labels')
print('{0:^30s} {1:^13s}'.format('CLASS', 'IMAGE COUNT'))
countlist=[]
classlist=[]
for label in sorted(list(train_df['labels'].unique())):
    group=groups.get_group(label)
    countlist.append(len(group))
    classlist.append(label)
    print('{0:^30s} {1:^13s}'.format(label, str(len(group))))

# get the classes with the minimum and maximum number of train images
max_value=np.max(countlist)
max_index=countlist.index(max_value)
max_class=classlist[max_index]
min_value=np.min(countlist)
min_index=countlist.index(min_value)
min_class=classlist[min_index]
print(max_class, ' has the most images= ',max_value, ' ', min_class, ' has the least images= ', min_value)
# let us get the average height and width of a sample of the train images
ht=0
wt=0
# Select 100 random samples of train_df
train_df_sample=train_df.sample(n=100, random_state=123,axis=0)
for i in range (len(train_df_sample)):
    fpath=train_df_sample['filepaths'].iloc[i]
    img=plt.imread(fpath)
    shape=img.shape
    ht += shape[0]
    wt += shape[1]
print('average height= ', ht//100, ' average width= ', wt//100, 'aspect ratio= ', ht/wt)
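The two train_test_split calls above carve the labeled data into roughly 90% training, 5% validation, and 5% test. The arithmetic can be sketched as follows. The helper `split_sizes` is hypothetical, and it assumes the `train_size` share is rounded down with the remainder going to the other part, so the counts may differ slightly from what scikit-learn reports.

```python
import math


def split_sizes(n_samples, train_share=0.9, valid_share_of_rest=0.5):
    """Estimate the sizes of the train, validation, and test splits."""
    # first split: train_share of all samples goes to training
    n_train = math.floor(train_share * n_samples)
    n_rest = n_samples - n_train
    # second split: the held-out remainder is divided between validation and test
    n_valid = math.floor(valid_share_of_rest * n_rest)
    n_test = n_rest - n_valid
    return n_train, n_valid, n_test


# 1000 is an arbitrary sample count chosen for illustration
print(split_sizes(1000))
```

Since both calls pass `stratify`, each split should keep approximately the same class proportions as the full dataframe.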

Output


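Selecting Features

The trim function below takes a dataframe, a maximum number of samples (max_samples), a minimum number of samples (min_samples), and a column name. It returns a dataframe in which no class has more than max_samples images. Any class with fewer than min_samples images is removed from the dataframe.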

def trim(df, max_samples, min_samples, column):
    # cap every class at max_samples rows and drop classes with fewer than min_samples rows
    kept_groups=[]
    for label, group in df.groupby(column):
        count=len(group)
        if count > max_samples:
            # randomly downsample an oversized class
            kept_groups.append(group.sample(n=max_samples, random_state=123, axis=0))
        elif count >= min_samples:
            kept_groups.append(group)
    if kept_groups:
        trimmed_df=pd.concat(kept_groups, axis=0)
    else:
        trimmed_df=pd.DataFrame(columns=df.columns)
    print('after trimming, no class has more than ', max_samples, ' samples and classes with fewer than ', min_samples, ' samples were removed')
    return trimmed_df

max_samples=300 # Since each class has more than 300 images all classes will be trimmed to have 300 images per class
min_samples=300
column='labels'
train_df= trim(train_df, max_samples, min_samples, column)
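The trimming rule described above can be checked on toy data without pandas. The sketch below applies the same two thresholds to a plain dictionary that maps a class label to a list of file names. The function name `trim_groups` and the toy file names are hypothetical.

```python
import random


def trim_groups(groups, max_samples, min_samples, seed=123):
    """Cap each class at max_samples items and drop classes below min_samples."""
    rng = random.Random(seed)
    trimmed = {}
    for label, items in groups.items():
        if len(items) > max_samples:
            # oversized class: keep a random subset of max_samples items
            trimmed[label] = rng.sample(items, max_samples)
        elif len(items) >= min_samples:
            trimmed[label] = list(items)
        # classes with fewer than min_samples items are left out
    return trimmed


toy = {
    'running': [f'run_{i}.jpg' for i in range(5)],
    'sitting': [f'sit_{i}.jpg' for i in range(3)],
    'eating': [f'eat_{i}.jpg' for i in range(1)],
}
result = trim_groups(toy, max_samples=3, min_samples=2)
print({label: len(items) for label, items in result.items()})
```

When max_samples and min_samples are equal, as in the call above with 300, every class that survives ends up with exactly that many images.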

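Output

Splitting the Dataset

The dataframe has already been split into training, validation, and test sets, so we now create an image generator for each split.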

working_dir=r'./'
img_size=(200,260)
batch_size=30 # We will use an EfficientNetB3 model; with an image size of (200, 260) this batch size should not cause a resource error
trgen=ImageDataGenerator(horizontal_flip=True,rotation_range=20, width_shift_range=.2,
                                  height_shift_range=.2, zoom_range=.2 )
t_and_v_gen=ImageDataGenerator()
msg='{0:70s} for train generator'.format(' ')
print(msg, '\r', end='') # prints over on the same line
train_gen=trgen.flow_from_dataframe(train_df, x_col='filepaths', y_col='labels', target_size=img_size,
                                   class_mode='categorical', color_mode='rgb', shuffle=True, batch_size=batch_size)
msg='{0:70s} for valid generator'.format(' ')
print(msg, '\r', end='') # prints over on the same line
valid_gen=t_and_v_gen.flow_from_dataframe(valid_df, x_col='filepaths', y_col='labels', target_size=img_size,
                                   class_mode='categorical', color_mode='rgb', shuffle=False, batch_size=batch_size)
# For the test_gen we want to calculate the batch size and test steps such that batch_size X test_steps= number of samples in the test set
# This ensures that we go through all the samples in the test set exactly once.
length=len(test_df)
test_batch_size=sorted([int(length/n) for n in range(1,length+1) if length % n ==0 and length/n<=80],reverse=True)[0]  
test_steps=int(length/test_batch_size)
msg='{0:70s} for test generator'.format(' ')
print(msg, '\r', end='') # prints over on the same line
test_gen=t_and_v_gen.flow_from_dataframe(test_df, x_col='filepaths', y_col='labels', target_size=img_size,
                                   class_mode='categorical', color_mode='rgb', shuffle=False, batch_size=test_batch_size)
# From the generator we can get the information we will need later
classes=list(train_gen.class_indices.keys())
class_indices=list(train_gen.class_indices.values())
class_count=len(classes)
labels=test_gen.labels
print ( 'test batch size: ' ,test_batch_size, '  test steps: ', test_steps, ' number of classes : ', class_count)
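The test batch size above is the largest divisor of the test set length that does not exceed 80, which guarantees that batch size times steps equals the number of samples. A minimal standalone sketch of that calculation follows. The function name `exact_batch_size` is hypothetical, and it assumes the length is at least 1.

```python
def exact_batch_size(length, max_batch=80):
    """Return the largest batch size <= max_batch that divides length evenly."""
    divisors = [d for d in range(1, min(length, max_batch) + 1) if length % d == 0]
    return max(divisors)


# 5400 is the test set size stated in the dataset description; 630 and 97 are arbitrary
for n in (630, 5400, 97):
    batch = exact_batch_size(n)
    print('samples:', n, ' batch size:', batch, ' steps:', n // batch)
```

For a prime length larger than the cap, the only valid divisor is 1, so prediction falls back to one image per step. That is correct but slow, which is the trade-off of insisting on an exact fit.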

Output

Let's look at some images from the training set.

def show_image_samples(gen ):
    t_dict=gen.class_indices
    classes=list(t_dict.keys())    
    images,labels=next(gen) # Get a sample batch from the generator 
    plt.figure(figsize=(20, 20))
    length=len(labels)
    if length<25:   #show maximum of 25 images
        r=length
    else:
        r=25
    for i in range(r):        
        plt.subplot(5, 5, i + 1)
        image=images[i] /255       
        plt.imshow(image)
        index=np.argmax(labels[i])
        class_name=classes[index]
        plt.title(class_name, color='blue', fontsize=12)
        plt.axis('off')
    plt.show()
    
show_image_samples(train_gen )

Output

Model

Here we create the model, using transfer learning with the EfficientNetB3 model.

img_shape=(img_size[0], img_size[1], 3)
model_name='EfficientNetB3'
base_model=tf.keras.applications.efficientnet.EfficientNetB3(include_top=False, weights="imagenet",input_shape=img_shape, pooling='max') 
# Note you are always told NOT to make the base model trainable initially- that is WRONG you get better results leaving it trainable
base_model.trainable=True
x=base_model.output
x=BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001 )(x)
x = Dense(256, kernel_regularizer = regularizers.l2(l = 0.016),activity_regularizer=regularizers.l1(0.006),
                bias_regularizer=regularizers.l1(0.006) ,activation='relu')(x)
x=Dropout(rate=.4, seed=123)(x)       
output=Dense(class_count, activation='softmax')(x)
model=Model(inputs=base_model.input, outputs=output)
lr=.001 # Start with this learning rate
model.compile(Adamax(learning_rate=lr), loss='categorical_crossentropy', metrics=['accuracy']) 
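To get a feel for how small the added classification head is compared with the backbone, its parameter count can be worked out by hand. The sketch below assumes the pooled EfficientNetB3 output has 1536 features, which `model.summary()` can confirm for your installation. The helper names are hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def batchnorm_params(n_features):
    """Gamma, beta, moving mean, and moving variance, one of each per feature."""
    return 4 * n_features


features = 1536    # assumed width of the pooled EfficientNetB3 output
hidden = 256       # units in the Dense layer of the head
class_count = 15   # number of activity classes

head = {
    'batch_normalization': batchnorm_params(features),
    'dense_256': dense_params(features, hidden),
    'dense_output': dense_params(hidden, class_count),
}
total = sum(head.values())
print(head, 'total:', total)
```

Dropout adds no parameters, so it does not appear in the count.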

class ASK(keras.callbacks.Callback):
    def __init__ (self, model, epochs,  ask_epoch): # initialization of the callback
        super(ASK, self).__init__()
        # the model argument is not stored here: Keras attaches the model to the
        # callback through set_model() when fit() starts, so self.model is available later
        self.ask_epoch=ask_epoch
        self.epochs=epochs
        self.ask=True # if True query the user on a specified epoch
        
    def on_train_begin(self, logs=None): # This runs on the beginning of training
        if self.ask_epoch == 0: 
            print('you set ask_epoch = 0, ask_epoch will be set to 1', flush=True)
            self.ask_epoch=1
        if self.ask_epoch >= self.epochs: # ask_epoch would never be reached, so train for all epochs
            print('ask_epoch >= epochs, will train for ', self.epochs, ' epochs', flush=True)
            self.ask=False # do not query the user
        if self.epochs == 1:
            self.ask=False # running only for 1 epoch so do not query the user
        if self.ask: # only announce the query when it will actually happen
            print('Training will proceed until epoch', self.ask_epoch,' then you will be asked to') 
            print(' enter H to halt training or enter an integer for how many more epochs to run then be asked again')  
        self.start_time= time.time() # set the time at which training started
        
    def on_train_end(self, logs=None):   # runs at the end of training     
        tr_duration=time.time() - self.start_time   # determine how long the training cycle lasted         
        hours = tr_duration // 3600
        minutes = (tr_duration - (hours * 3600)) // 60
        seconds = tr_duration - ((hours * 3600) + (minutes * 60))
        msg = f'training elapsed time was {hours:.0f} hours, {minutes:.0f} minutes, {seconds:4.2f} seconds'
        print (msg, flush=True) # print out training duration time
        
    def on_epoch_end(self, epoch, logs=None):  # method runs on the end of each epoch
        if self.ask: # are the conditions right to query the user?
            if epoch + 1 ==self.ask_epoch: # is this epoch the one for querying the user?
                print('\n Enter H to end training or  an integer for the number of additional epochs to run then ask again')
                ans=input()
                
                if ans == 'H' or ans =='h' or ans == '0': # quit training for these conditions
                    print ('you entered ', ans, ' Training halted on epoch ', epoch+1, ' due to user input\n', flush=True)
                    self.model.stop_training = True # halt training
                else: # user wants to continue training
                    self.ask_epoch += int(ans)
                    if self.ask_epoch > self.epochs:
                        print('\nYou specified a maximum of ', self.epochs, ' epochs, cannot train until epoch ', self.ask_epoch, flush =True)
                    else:
                        print ('you entered ', ans, ' Training will continue to epoch ', self.ask_epoch, flush=True)

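The callback lets the user decide at key points during training whether to continue or stop. By letting users adjust the training duration to their own needs, it gives them flexibility and control.

Next, we instantiate the custom callback and create two more callbacks that control the learning rate and early stopping.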

epochs=40
ask_epoch=10
ask=ASK(model, epochs,  ask_epoch)
rlronp=tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2,verbose=1)
estop=tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=4, verbose=1,restore_best_weights=True)
callbacks=[rlronp, estop, ask]
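To see how the two built-in callbacks interact, the sketch below replays their decision rules on a list of validation losses in plain Python. It is a simplified model, not the Keras implementation: it ignores options such as `min_delta`, `cooldown`, and `min_lr`, and the function name `simulate_callbacks` and the loss values are hypothetical.

```python
def simulate_callbacks(val_losses, lr=0.001, factor=0.5, lr_patience=2, stop_patience=4):
    """Return (epoch, learning rate after that epoch) pairs until early stopping fires."""
    best = float('inf')
    lr_wait = 0     # epochs without improvement since the last learning-rate cut
    stop_wait = 0   # epochs without improvement since the best validation loss
    history = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # improvement: remember it and reset both counters
            best = loss
            lr_wait = 0
            stop_wait = 0
        else:
            lr_wait += 1
            stop_wait += 1
            if lr_wait >= lr_patience:
                lr *= factor   # plateau detected: shrink the learning rate
                lr_wait = 0
        history.append((epoch, lr))
        if stop_wait >= stop_patience:
            break   # early stopping: no improvement for stop_patience epochs
    return history


# made-up validation losses: improvement for two epochs, then a plateau
losses = [1.0, 0.8, 0.85, 0.9, 0.95, 0.99, 0.7]
history = simulate_callbacks(losses)
for epoch, rate in history:
    print('epoch', epoch, 'learning rate', rate)
```

With a learning-rate patience of 2 and a stopping patience of 4, the learning rate can be halved twice before training is stopped, which gives the optimizer two chances to escape a plateau.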

Training the Model

We now train the model on the training set.

history=model.fit(x=train_gen,  epochs=epochs, verbose=1, callbacks=callbacks,  validation_data=valid_gen,
               validation_steps=None,  shuffle=False,  initial_epoch=0)

Output

def tr_plot(tr_data, start_epoch):
    #Plot the training and validation data
    tacc=tr_data.history['accuracy']
    tloss=tr_data.history['loss']
    vacc=tr_data.history['val_accuracy']
    vloss=tr_data.history['val_loss']
    Epoch_count=len(tacc)+ start_epoch
    Epochs=[]
    for i in range (start_epoch ,Epoch_count):
        Epochs.append(i+1)   
    index_loss=np.argmin(vloss)#  this is the epoch with the lowest validation loss
    val_lowest=vloss[index_loss]
    index_acc=np.argmax(vacc)
    acc_highest=vacc[index_acc]
    plt.style.use('fivethirtyeight')
    sc_label='best epoch= '+ str(index_loss+1 +start_epoch)
    vc_label='best epoch= '+ str(index_acc + 1+ start_epoch)
    fig,axes=plt.subplots(nrows=1, ncols=2, figsize=(20,8))
    axes[0].plot(Epochs,tloss, 'r', label='Training loss')
    axes[0].plot(Epochs,vloss,'g',label='Validation loss' )
    axes[0].scatter(index_loss+1 +start_epoch,val_lowest, s=150, c= 'blue', label=sc_label)
    axes[0].set_title('Training and Validation Loss')
    axes[0].set_xlabel('Epochs')
    axes[0].set_ylabel('Loss')
    axes[0].legend()
    axes[1].plot (Epochs,tacc,'r',label= 'Training Accuracy')
    axes[1].plot (Epochs,vacc,'g',label= 'Validation Accuracy')
    axes[1].scatter(index_acc+1 +start_epoch,acc_highest, s=150, c= 'blue', label=vc_label)
    axes[1].set_title('Training and Validation Accuracy')
    axes[1].set_xlabel('Epochs')
    axes[1].set_ylabel('Accuracy')
    axes[1].legend()
    plt.tight_layout()
    plt.show()
    
tr_plot(history,0)

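Output

Making Predictions on the Test Set

We will define a function that takes a test generator and an integer test_steps, generates predictions on the test set, and produces a confusion matrix and a classification report.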

def predictor(test_gen, test_steps):
    y_pred= []
    y_true=test_gen.labels
    classes=list(train_gen.class_indices.keys())
    class_count=len(classes)
    errors=0
    preds=model.predict(test_gen, steps=test_steps, verbose=1) # predict on the test set
    tests=len(preds)
    for i, p in enumerate(preds):
            pred_index=np.argmax(p)         
            true_index=test_gen.labels[i]  # labels are integer values
            if pred_index != true_index: # a misclassification has occurred                                           
                errors=errors + 1
            y_pred.append(pred_index)
    acc=( 1-errors/tests) * 100
    print(f'there were {errors} in {tests} tests for an accuracy of {acc:6.2f}')
    ypred=np.array(y_pred)
    ytrue=np.array(y_true)
    if class_count <=30:
        cm = confusion_matrix(ytrue, ypred )
        # plot the confusion matrix
        plt.figure(figsize=(12, 8))
        sns.heatmap(cm, annot=True, vmin=0, fmt='g', cmap='Blues', cbar=False)       
        plt.xticks(np.arange(class_count)+.5, classes, rotation=90)
        plt.yticks(np.arange(class_count)+.5, classes, rotation=0)
        plt.xlabel("Predicted")
        plt.ylabel("Actual")
        plt.title("Confusion Matrix")
        plt.show()
    clr = classification_report(y_true, y_pred, target_names=classes, digits= 4) # create classification report
    print("Classification Report:\n----------------------\n", clr)
    return errors, tests
errors, tests=predictor(test_gen, test_steps)
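The confusion matrix and accuracy reported by the function above can be reproduced by hand from two lists of class indices. The sketch below is a minimal plain-Python version with rows for the actual class and columns for the predicted class. The function names and the six toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, class_count):
    """Build a class_count x class_count matrix: rows are actual, columns are predicted."""
    matrix = [[0] * class_count for _ in range(class_count)]
    for true_index, pred_index in zip(y_true, y_pred):
        matrix[true_index][pred_index] += 1
    return matrix


def accuracy_percent(matrix):
    """Correct predictions sit on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100 * correct / total


# toy labels for three classes, for illustration only
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_counts(y_true, y_pred, class_count=3)
for row in cm:
    print(row)
print(f'accuracy: {accuracy_percent(cm):.2f}')
```

Off-diagonal cells show which pairs of classes the model confuses, which is often more informative than the single accuracy figure.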

Output

The result looks good: the model reaches an accuracy of 79%.

Now we need to save the model.

subject='activities' 
acc=str(( 1-errors/tests) * 100)
index=acc.rfind('.')
acc=acc[:index + 3]
save_id= subject + '_' + str(acc) + '.h5' 
model_save_loc=os.path.join(working_dir, save_id)
model.save(model_save_loc)
print ('model was saved as ' , model_save_loc ) 
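The file name above embeds the test accuracy by slicing its string form. A format specifier is a possible alternative, sketched below. Note that it rounds to two decimals where the slicing truncates, so the two approaches can differ in the last digit. The helper `build_save_name` and the numbers passed to it are hypothetical.

```python
def build_save_name(subject, errors, tests, extension='.h5'):
    """Compose a model file name that embeds the accuracy with two decimals."""
    acc = (1 - errors / tests) * 100
    return f'{subject}_{acc:.2f}{extension}'


# made-up counts for illustration: 1 error in 8 tests
name = build_save_name('activities', 1, 8)
print(name)
```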

Output

Processing the Test Set

test_csv_path=r'../input/human-action-recognition-har-dataset/Human Action Recognition/Testing_set.csv'
test_img_path=r'../input/human-action-recognition-har-dataset/Human Action Recognition/test'
test_df=pd.read_csv(test_csv_path)
# modify df to have column name filepaths 
test_df.columns=['filepaths']
# modify df so entries in filepaths column are the full path to the image file
test_df['filepaths']=test_df['filepaths'].apply(lambda x: os.path.join(test_img_path, x))
# create the test generator
length=len(test_df)
test_batch_size=sorted([int(length/n) for n in range(1,length+1) if length % n ==0 and length/n<=80],reverse=True)[0]  
test_steps=int(length/test_batch_size)
msg='{0:70s} for test generator'.format(' ')
print(msg, '\r', end='') # prints over on the same line
test_gen=t_and_v_gen.flow_from_dataframe(test_df, x_col='filepaths', y_col=None, target_size=img_size,
                                   class_mode=None, color_mode='rgb', shuffle=False, batch_size=test_batch_size)
# make predictions on the test dataframe
image_paths=[]
pred_class=[]
preds=model.predict(test_gen, steps=test_steps, verbose=1)
# iterate through predictions to create submit_df 
for i, p in enumerate (preds):
    index=np.argmax(p)
    klass=classes[index]
    pred_class.append(klass)
    file=test_gen.filenames[i]
    image_id=os.path.basename(file)
    image_paths.append(image_id)
Fseries=pd.Series(image_paths)
Lseries=pd.Series(pred_class)
submit_df=pd.concat([Fseries, Lseries], axis=1)
submit_df.columns=['filename', 'class']
print(submit_df.head())
submit_path=os.path.join(working_dir, 'submit.csv')
submit_df.to_csv(submit_path,index=False) 
# read back in the csv file to see if it is correct
check_df=pd.read_csv(submit_path)
print (check_df.head())

Output

The model works well.
