CS231n Assignment 1: Q5: Higher Level Representations: Image Features

My solution on GitHub: https://github.com/qkrtmdtj04/CS231n-Assignment

What Are Image Features? (Motivation)

Up through Q4, we implemented a neural network ourselves and fed it raw pixel values directly. Raw pixel values, however, are a very inefficient way to represent what an image means.

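For example, even for the same cat photo, a change in lighting or a small shift in position changes the pixel values completely. From the model's point of view, it can only wonder, "Is this the same cat?"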

This is where feature extraction comes in. Instead of using the pixels directly, we first extract meaningful features from the image and then use the resulting feature vector as the input to the classifier.

The Two Features Used in Q5

1. HOG (Histogram of Oriented Gradients)

HOG captures the orientation of edges in an image. The image is divided into small cells, and within each cell the directions of brightness change (gradients) are summarized as a histogram.

Because it ignores color and describes the image mainly through shape and contours, it is relatively robust to lighting changes.
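The cell-and-histogram idea can be sketched in a few lines of NumPy. This is a simplified illustration, not the assignment's own `hog_feature` implementation: the function name `hog_sketch`, the 8-pixel cell size, and the 9 orientation bins are assumptions chosen for the example, and block normalization is left out.

```python
import numpy as np

def hog_sketch(gray, cell_size=8, n_bins=9):
    """Simplified HOG: one unsigned-orientation histogram per cell (hypothetical helper)."""
    gray = gray.astype(np.float64)
    # Brightness gradients along rows (gy) and columns (gx).
    gy, gx = np.gradient(gray)
    magnitude = np.hypot(gx, gy)
    # Unsigned gradient orientation in degrees, folded into [0, 180].
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180.0

    n_rows = gray.shape[0] // cell_size
    n_cols = gray.shape[1] // cell_size
    features = np.zeros((n_rows, n_cols, n_bins))
    for r in range(n_rows):
        for c in range(n_cols):
            rows = slice(r * cell_size, (r + 1) * cell_size)
            cols = slice(c * cell_size, (c + 1) * cell_size)
            # Each pixel votes for its orientation bin, weighted by gradient strength.
            hist, _ = np.histogram(
                orientation[rows, cols],
                bins=n_bins,
                range=(0.0, 180.0),
                weights=magnitude[rows, cols],
            )
            features[r, c] = hist
    return features.ravel()

# A 32x32 test image with one vertical edge: dark left half, bright right half.
image = np.zeros((32, 32))
image[:, 16:] = 1.0
feats = hog_sketch(image)
print(feats.shape)  # (144,) = 4 x 4 cells x 9 bins
```

In this test image the only gradients point horizontally, so all of the histogram mass lands in the first orientation bin of the cells that touch the edge.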

2. Color Histogram

The image is converted to the HSV color space, and its color distribution is represented as a histogram over the hue channel. Where HOG handles shape, the color histogram handles color.
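A minimal sketch of a color histogram, shown here for the hue channel only. The helper name `hue_histogram_sketch`, the choice of 10 bins, and the assumption that the image holds RGB floats in [0, 1] are illustrative and not taken from the assignment code.

```python
import colorsys
import numpy as np

def hue_histogram_sketch(rgb, n_bins=10):
    """Normalized hue histogram of an RGB image with float values in [0, 1]."""
    pixels = rgb.reshape(-1, 3)
    # colorsys returns (h, s, v), each in [0, 1]; only the hue is kept.
    hues = np.array([colorsys.rgb_to_hsv(r, g, b)[0] for r, g, b in pixels])
    hist, _ = np.histogram(hues, bins=n_bins, range=(0.0, 1.0))
    return hist / hist.sum()

# Left half pure red (hue 0), right half pure blue (hue 2/3).
image = np.zeros((4, 4, 3))
image[:, :2, 0] = 1.0
image[:, 2:, 2] = 1.0
hue_hist = hue_histogram_sketch(image)
print(hue_hist)
```

Half of the mass falls in the first bin (red) and half in the bin covering hue 2/3 (blue), regardless of where in the image those pixels sit. That position independence is what makes the histogram a compact summary of color.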

These two are concatenated into a single feature vector, which is then fed to the SVM classifier. That is the core flow of Q5.
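The concatenation step might look like the sketch below. The random arrays stand in for real HOG and color histogram outputs, and the sizes 144 and 10 are placeholders. The standardization step is an assumption added here as a common preprocessing choice before training a classifier; the text above does not describe it.

```python
import numpy as np

def combine_features(feature_blocks):
    """Concatenate per-image feature blocks column-wise into one matrix."""
    return np.hstack(feature_blocks)

def standardize(train, others):
    """Scale to zero mean and unit variance using training-set statistics only."""
    mean = train.mean(axis=0, keepdims=True)
    std = train.std(axis=0, keepdims=True) + 1e-8  # guard against division by zero
    return (train - mean) / std, [(x - mean) / std for x in others]

rng = np.random.default_rng(0)
hog_train = rng.random((5, 144))   # placeholder HOG features for 5 images
color_train = rng.random((5, 10))  # placeholder color histograms for 5 images
hog_val = rng.random((2, 144))
color_val = rng.random((2, 10))

train_feats = combine_features([hog_train, color_train])
val_feats = combine_features([hog_val, color_val])
train_feats, (val_feats,) = standardize(train_feats, [val_feats])
print(train_feats.shape, val_feats.shape)  # (5, 154) (2, 154)
```

Computing the mean and standard deviation on the training split and reusing them for validation keeps validation data from leaking into the preprocessing.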

Q5 Image Features Solution

Q5-1: Feature Extraction and SVM Training

- This code trains an SVM classifier on the HOG and HSV color histogram features extracted above. It reaches 41% accuracy, an improvement over the earlier SVM trained on raw pixels.

python

# Use the validation set to tune the learning rate and regularization strength

import numpy as np
from cs231n.classifiers.linear_classifier import LinearSVM

learning_rates = [1e-9, 1e-8, 1e-7]
regularization_strengths = [5e4, 5e5, 5e6]

results = {}
best_val = -1
best_svm = None

################################################################################
# TODO:                                                                        #
# Use the validation set to set the learning rate and regularization strength. #
# This should be identical to the validation that you did for the SVM; save    #
# the best trained classifier in best_svm. You might also want to play         #
# with different numbers of bins in the color histogram. If you are careful    #
# you should be able to get accuracy of near 0.44 on the validation set.       #
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

for lr in learning_rates:
    for reg in regularization_strengths:
        # Train a fresh linear SVM on the extracted features for this setting.
        model = LinearSVM()
        model.train(X_train_feats, y_train, learning_rate=lr, reg=reg)
        y_train_pred = model.predict(X_train_feats)
        y_val_pred = model.predict(X_val_feats)

        train_acc = np.mean(y_train == y_train_pred)
        val_acc = np.mean(y_val == y_val_pred)
        results[(lr, reg)] = (train_acc, val_acc)

        # Keep the classifier with the best validation accuracy so far.
        if val_acc > best_val:
            best_val = val_acc
            best_svm = model

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train accuracy: %f val accuracy: %f' % (
                lr, reg, train_accuracy, val_accuracy))

print('best validation accuracy achieved: %f' % best_val)

Inline question 1:
Describe the misclassification results that you see. Do they make sense?
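YourAnswer: Probably because the features capture edges, objects with flat shapes are classified unreliably and get confused with other objects, while animals seem to be confused mostly with other animals.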
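One way to check this kind of observation numerically is a confusion matrix, which counts how often each true class is predicted as each other class. The sketch below uses made-up labels; the helper name `confusion_matrix` and the toy arrays are illustrative only.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """counts[i, j] = number of samples with true class i predicted as class j."""
    counts = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(counts, (y_true, y_pred), 1)
    return counts

# Toy labels for three classes.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
cm = confusion_matrix(y_true, y_pred, 3)
print(cm)
```

With real predictions, large off-diagonal entries between two object classes or between two animal classes would support the answer above.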

Q5-2: Two-Layer Network Training and Hyperparameter Tuning

As in Q2, the validation set is used to search for the best learning_rate and regularization_strength, this time for a two-layer network trained on the features.

python

best_accuracy = -1
best_net = None

learning_rates = [1, 0.1, 1e-2]
regularization_strengths = [0., 0.1]

for lr in learning_rates:
    for reg in regularization_strengths:
        # Train a fresh two-layer network on the feature vectors for this setting.
        net = TwoLayerNet(input_dim, hidden_dim, num_classes, reg=reg)
        solver = Solver(net, data, optim_config={'learning_rate': lr}, verbose=False)
        solver.train()
        # Keep the network with the best validation accuracy so far.
        acc = solver.check_accuracy(data["X_val"], data["y_val"])
        if acc > best_accuracy:
            best_accuracy = acc
            best_net = net

print(best_accuracy)

Training Results
Test set accuracy: 58%

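Compared with using raw pixels directly, the models trained on extracted features are noticeably more accurate. Even the SVM, which is only a linear classifier, improves, and the 58% test accuracy comes from the two-layer network trained on the same features. The result shows plainly how much the choice of input representation affects performance.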

Dev_PSS