Posts in the 'All categories' category

[Paper/Action Recognition] Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition 2018.09.19
[Paper / Action Recognition] PoTion : Pose MoTion Representation for Action Recognition 2018.09.07
[Eungbongsan night view] Taking in the night view from Eungbongsan 2018.09.03
[Bundang/Jeongja] Sushi Kun / photos 2018.09.02
[Sharosu-gil restaurants] Kumo Sikdang review 2018.08.22
[Paper/Action Recognition] I3D and the Kinetics dataset 2018.08.22
Notes from trying to install OpenCV 3.3.0 on macOS 2018.08.22
[Leave of absence] Notes on spending a year as nobody in particular 2018.08.03
[Grad school] Six months into grad school 2018.08.03
[Employment] First day at the new job 2017.12.18

[Paper/Action Recognition] Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition

2018. 9. 19. 22:17

Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition

Shuyang Sun, Zhanghui Kuang, Lu Sheng, Wanli Ouyang, Wei Zhang
The University of Sydney, SenseTime Research, The Chinese University of Hong Kong

Abstract

Proposes a novel, compact motion representation named Optical Flow guided Feature (OFF).
OFF can be embedded in any existing CNN-based video action recognition framework.

1. Introduction

Temporal information is the key.
Optical flow is a useful motion representation, but it is inefficient to compute.
3D CNNs do not perform as well as Two-stream networks with optical flow.
OFF is a new feature representation derived from the space orthogonal to optical flow at the feature level.
- Spatial gradients of feature maps in horizontal, vertical directions
- Temporal gradients

Hand-crafted features
Deep-features
- Optical flow
- 3D CNN
- RNN
OFF
- Well captures the motion patterns
- Complementary to other motion representations

3. Optical Flow Guided Feature : OFF

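Optical Flow
- $I(x, y, t) = I(x + \Delta x, y + \Delta y, t + \Delta t)$
  - $I(x, y, t)$ : pixel at location $(x, y)$ of the frame at time $t$
  - $(\Delta x, \Delta y)$ : spatial pixel displacement along each axis
Apply at feature level
- $f(I; w)(x, y, t) = f(I; w)(x + \Delta x, y + \Delta y, t + \Delta t)$
  - $f$ : mapping function for extracting features from image $I$
  - $w$ : parameters in $f$
According to definition of optical flow
- $\frac{\partial f}{\partial x} v_x + \frac{\partial f}{\partial y} v_y + \frac{\partial f}{\partial t} = 0$
  - $(v_x, v_y)$ : feature level optical flow
OFF : $\vec{F}(I; w) = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial t} \right]$
- Orthogonal to the feature level optical flow vector $[v_x, v_y, 1]$ and changes as it changes.
- Encodes spatial-temporal information orthogonally and complementarily to the feature level optical flow $(v_x, v_y)$.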
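
The orthogonality property can be checked numerically. The sketch below is only an illustration under stated assumptions: the toy feature function, the velocity constants VX and VY, and the helper name off_vector are all made up for this example and do not come from the paper. It builds a smooth pattern that translates with a known flow and confirms that the vector of spatial and temporal gradients is (numerically) orthogonal to (VX, VY, 1).

```python
import math

# Assumed toy setup: a smooth "feature map" that translates rigidly with
# a known feature-level flow (VX, VY). Nothing here comes from the paper's code.
VX, VY = 0.5, -0.25

def feature(x, y, t):
    """Feature value at location (x, y) and time t for the translating pattern."""
    return math.sin(x - VX * t) + math.cos(0.5 * (y - VY * t))

def off_vector(f, x, y, t, h=1e-4):
    """Central-difference estimate of (df/dx, df/dy, df/dt)."""
    fx = (f(x + h, y, t) - f(x - h, y, t)) / (2 * h)
    fy = (f(x, y + h, t) - f(x, y - h, t)) / (2 * h)
    ft = (f(x, y, t + h) - f(x, y, t - h)) / (2 * h)
    return fx, fy, ft

fx, fy, ft = off_vector(feature, 1.0, 2.0, 3.0)
# Dot product of the gradient vector with (VX, VY, 1); should be close to zero.
residual = fx * VX + fy * VY + ft
print(f"OFF = ({fx:.4f}, {fy:.4f}, {ft:.4f}), residual = {residual:.2e}")
```

The residual is zero only up to finite-difference error, and only because the toy pattern moves rigidly; real feature maps satisfy the constraint approximately at best.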

4. Using Optical Flow Guided Feature in CNN

4.1. Network Architecture

Feature Generation Sub-network

BN-Inception for extracting feature map

OFF Sub-network

1x1 convolutional layer
Apply Sobel operator for spatial gradients
Element-wise subtraction for temporal gradients
Concatenate features from lower level.
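
A minimal sketch of the gradient steps listed above, assuming single-channel feature maps stored as nested lists and zero padding at the borders. The helper names correlate3x3 and off_unit are hypothetical; the 1x1 convolution and the concatenation with lower-level features are left out, so this is not the paper's implementation.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def correlate3x3(fmap, kernel):
    """3x3 cross-correlation with zero padding (the CNN-style 'convolution')."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[ky][kx] * fmap[yy][xx]
            out[y][x] = acc
    return out

def off_unit(feat_t, feat_next):
    """Return [horizontal gradient, vertical gradient, temporal gradient]."""
    gx = correlate3x3(feat_t, SOBEL_X)          # spatial gradient along x
    gy = correlate3x3(feat_t, SOBEL_Y)          # spatial gradient along y
    gt = [[b - a for a, b in zip(row_a, row_b)]  # element-wise subtraction
          for row_a, row_b in zip(feat_t, feat_next)]
    return [gx, gy, gt]

# Toy input: a horizontal ramp, and the same ramp brightened by 1 at the next frame.
ramp = [[float(x) for x in range(5)] for _ in range(5)]
brighter = [[v + 1.0 for v in row] for row in ramp]
gx, gy, gt = off_unit(ramp, brighter)
print(gx[2][2], gy[2][2], gt[2][2])
```

At the interior pixel (2, 2) the ramp has a pure horizontal slope, so only the x gradient responds; border values are affected by the zero padding.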

Classification Sub-network

A separate inner-product classifier for each feature level
Classification scores are averaged

4.2. Network Training

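Feature of the $t$-th segment on level $l$ : $F_{t,l}$
Classification score of $F_{t,l}$ : $G_{t,l}$
- $G_l = \mathcal{G}(G_{1,l}, G_{2,l}, \ldots, G_{N_t,l})$, where $\mathcal{G}$ is average pooling for summarizing the scores over the $N_t$ segments
Cross-entropy loss for each level
- $L_l = -\sum_{c=1}^{C} y_c \left( G_{l,c} - \log \sum_{j=1}^{C} e^{G_{l,j}} \right)$
  - $C$ : number of categories
  - $y_c$ : ground-truth class label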
Two-stage training
- Train feature generation sub-network first.
- Train classification sub-network with feature network frozen.
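
The score aggregation and per-level loss described in this section can be sketched as follows. This is a generic average-pooling plus softmax cross-entropy example under assumed inputs; the function names and the toy scores are hypothetical, and the snippet handles a single level only.

```python
import math

def average_scores(segment_scores):
    """Average-pool per-segment class scores into one score vector."""
    n = len(segment_scores)
    return [sum(col) / n for col in zip(*segment_scores)]

def cross_entropy(scores, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(scores)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[label]

# Toy scores: 3 segments, 3 classes, ground-truth class 0 (all values assumed).
segment_scores = [[2.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [4.0, 1.0, 0.0]]
pooled = average_scores(segment_scores)
loss = cross_entropy(pooled, label=0)
print(pooled, round(loss, 4))
```

With several levels, one such loss would be computed per level from that level's pooled scores.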

4.3. Network Testing

Test under TSN framework
25 segments are sampled from RGB
The $t$-th segment is treated as Frame $t$

5. Experiments and Evaluations

5.1. Datasets and Implementation Details

UCF-101 / HMDB-51 datasets
4 NVIDIA TITAN X GPUs
Caffe & OpenMPI
Train feature generation network by TSN method
Train OFF sub-networks from scratch with feature generation networks frozen.

5.2. Experimental Investigations of OFF

Efficiency
- State-of-the-art among real-time methods

Effectiveness
- Investigates the robustness of OFF when applied to different inputs.

Comparison
- 2.0%/5.7% gain compared with the baseline Two-Stream TSN

6. Conclusion

OFF is fast (200 fps) and robust.
The result with only RGB input is comparable to Two-stream approaches.
Complementary to other motion representations.

License: Attribution

[Paper / Action Recognition] PoTion : Pose MoTion Representation for Action Recognition

2018. 9. 7. 13:54

PoTion : Pose MoTion Representation for Action Recognition

Vasileios Choutas, Philippe Weinzaepfel, Jérôme Revaud, Cordelia Schmid
Inria, NAVER LABS Europe

Abstract

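State-of-the-art methods for action recognition rely on two-stream networks that process appearance and motion independently.
This paper instead tries to process appearance and motion jointly.
Introduces a fixed-size representation that encodes the movement of pose keypoints over an entire clip.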

1. Introduction

Human pose can be added to multi-stream architecture
3D skeleton
- Limited to cases where 3D skeleton data is available
2D poses
- Used when the person is fully visible
- Features (hand-crafted, CNN) extracted around the human joints
Zolfaghari et al.
- Pose stream for semantic segmentation maps
- FC + spatio-temporal CNN
Propose to focus on the movement of a few relevant keypoints
- Fixed-sized representation, does not depend on the duration of the clip.
Overview of PoTion
- Obtain heat maps for human joint by running pose estimator (Part Affinity Fields).
- Colorize heat maps depending on the relative time of each frame in the clip.
- Obtain PoTion representation for clip.
- Train shallow CNN architecture for action classification.
Contributions
- Clip-level representation that encodes human pose motion, PoTion
- Study PoTion representation and CNN for action recognition
- Combine PoTion with two-stream architecture

CNNs for action recognition
- 2D + RNN
- 3D CNN
- Two-stream
Motion representation
- Optical flow, etc.
Pose representation
- 3D skeleton
- 2D pose
- Approach of Zolfaghari et al.

3. PoTion representation

3.1. Extracting joint heatmaps

Obtain human joint heatmaps by Part Affinity Fields
Part Affinity Fields
- Robust to occlusion and truncation
- Output : joint heatmap & affinity map
- Use only joint heatmap
$H_j^t[x, y]$ is the likelihood of pixel $(x, y)$ containing joint $j$ at frame $t$

3.2. Time-dependent heat map colorization

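'Colorize' each heat map according to the relative time of its frame within the clip
For C = 2
- One channel's weight falls linearly from the first frame to the last while the other channel's weight rises
For multiple channels C
- Split the T frames into C-1 regularly sampled intervals and apply the 2-channel scheme between consecutive channels within each interval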
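
A minimal sketch of this time-dependent colorization, under assumptions that are mine rather than the paper's: frames are indexed from 0, heat maps are nested lists, and the channel weights are piecewise linear "hat" functions so that they always sum to 1. The names color_weights and colorize are hypothetical.

```python
def color_weights(t, T, C):
    """Weights of the C color channels for frame t (0-based) of a T-frame clip.

    The clip is split into C-1 equal intervals; inside each interval the weight
    moves linearly from one channel to the next.
    """
    if T == 1:
        return [1.0] + [0.0] * (C - 1)
    pos = t / (T - 1) * (C - 1)          # position on the channel axis, in [0, C-1]
    return [max(0.0, 1.0 - abs(pos - c)) for c in range(C)]

def colorize(heatmap, t, T, C):
    """Multiply a joint heat map (rows of floats) by each channel weight."""
    return [[[v * w for v in row] for row in heatmap]
            for w in color_weights(t, T, C)]

print(color_weights(0, 5, 2), color_weights(2, 5, 2), color_weights(4, 5, 2))
print(color_weights(2, 5, 3))
print(colorize([[0.0, 2.0]], 1, 3, 2))
```

Which channel is "first" is a convention; swapping the order only changes the colors, not the information encoded.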

3.3. Aggregation of colorized heatmaps

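Compute the sum of the colorized heat maps over time: $S_j = \sum_t C_j^t$
Obtain a representation $U_j$ that is invariant to clip duration by normalizing each channel independently by its maximum over all pixels.
Compute the intensity image $I_j$ by summing $U_j$ over the channels
- Encodes how much time a joint stays at each location
Normalized PoTion representation : divide $U_j$ by the intensity, $N_j = U_j / (\epsilon + I_j)$
- All locations of the motion trajectory are weighted equally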
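
The aggregation steps above can be sketched for one joint as follows. The layout colorized[t][c][y][x], the function names, and the default eps = 1.0 are assumptions made for this illustration.

```python
def aggregate(colorized):
    """Sum colorized heat maps over time: colorized[t][c][y][x] -> S[c][y][x]."""
    T, C = len(colorized), len(colorized[0])
    H, W = len(colorized[0][0]), len(colorized[0][0][0])
    return [[[sum(colorized[t][c][y][x] for t in range(T)) for x in range(W)]
             for y in range(H)] for c in range(C)]

def normalize_channels(S):
    """Divide each channel by its own maximum (all-zero channels stay zero)."""
    out = []
    for channel in S:
        peak = max(max(row) for row in channel)
        out.append([[v / peak if peak > 0 else 0.0 for v in row] for row in channel])
    return out

def intensity(U):
    """Sum over channels: I[y][x]."""
    return [[sum(U[c][y][x] for c in range(len(U))) for x in range(len(U[0][0]))]
            for y in range(len(U[0]))]

def normalized_potion(U, eps=1.0):
    """Divide every channel by (eps + intensity) at each pixel."""
    I = intensity(U)
    return [[[U[c][y][x] / (eps + I[y][x]) for x in range(len(I[0]))]
             for y in range(len(I))] for c in range(len(U))]

# Toy clip: a joint moving left to right over 3 frames on a 1x3 map, C = 2.
colorized = [
    [[[1.0, 0.0, 0.0]], [[0.0, 0.0, 0.0]]],   # frame 0: all weight on channel 0
    [[[0.0, 0.5, 0.0]], [[0.0, 0.5, 0.0]]],   # frame 1: split between channels
    [[[0.0, 0.0, 0.0]], [[0.0, 0.0, 1.0]]],   # frame 2: all weight on channel 1
]
U = normalize_channels(aggregate(colorized))
I = intensity(U)
N = normalized_potion(U)
print(U, I, N)
```

In this toy clip the intensity is the same at all three locations, which is the sense in which every point of the trajectory ends up weighted equally after normalization.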

4. CNN on PoTion representation

Network architecture
- 3 blocks with 2 convolutional layers in each block
- 3 x 3 kernels; in each block the first convolution has stride 2 and the second has stride 1
- Global average pooling + FC + softmax after all blocks
- Batch normalization, ReLU after each Conv.
Implementation details
- Xavier initialization and train from scratch
- Dropout of 0.25
- Adam optimizer
- Batch size of 32
- 4 hours on single GPU (Titan X)
- Data augmentation
  - Flipping by swapping channels was effective
  - Random cropping, smoothing heat maps, and shifting gave no gain

5. Experimental results

5.1. Datasets and metrics

HMDB
JHMDB
UCF101
Kinetics
Report mean classification accuracy.

5.2 PoTion representation

Number of channels
Aggregation techniques

5.3. CNN on PoTion

Data augmentation
Network Architecture
- Number of convolution layers per block : 2
- Number of blocks : 3
- Number of convolution filters : (128, 256, 512)

5.4. Impact of pose estimation

Groundtruth pose : 4% gain
Crop frames centered on the actor (GT-JHMDB) : 6% gain

5.5. Comparison to the state of the art

Multi-stream approach
- Up to +8% on TSN / Up to 3% on I3D
Comparison to Zolfaghari et al.
- Improvement due to improved pose motion representation
Comparison to the state of the art
- Outperform all existing approaches
Detailed analysis
- Clear improvement for classes with well-defined motion
- Low performance when the object matters more than the motion
Results on Kinetics
- Accuracy decreased by 1~2%
- Reasons of decrease
  - Actors are only partially visible
  - Clips feature erratic camera motion
  - Multiple shots per clip

6. Conclusion

PoTion encodes the motion over entire clip into fixed-size representation.
Classification with PoTion representation and shallow CNN.
Leads to state-of-the-art performance on JHMDB, HMDB, and UCF-101.

Our discussion

Why not use 3D convolution for capturing temporal features?

[Eungbongsan night view] Taking in the night view from Eungbongsan

2018. 9. 3. 01:24

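After dinner, on my way home from school, I suddenly felt stifled, so I decided to go to Eungbongsan.

Eungbongsan is really close once you get off the subway at Eungbong Station.

Once you are up there it feels very high, yet you don't have to climb much,

so it's a good place to walk up casually and enjoy the night view.

Leave Eungbong Station by Exit 1 and just follow the route I marked.

The way to Eungbongsan is well signposted on the road along the route, so just follow the signs.

There is a CU convenience store (Eungbongcho branch) on the way, so you can pick up drinks or snacks.

There are quite a few benches at the summit, so it's a nice spot for a drink.

As you walk up along the road, there is a footbridge on your right.

Crossing the footbridge is fun, too.

There are several routes to the summit.

If you keep following the main road, around where the footbridge is

there is a path leading into the woods; take it, keep climbing, and you reach the summit.

At the summit there is a lit-up pavilion.

The space at the top is fairly large.

You can enjoy the night view in every direction.

There is also a deck-like area where you can take in the view.

Standing there, you get a lovely night view like this.

From the other side you can also see Namsan.

The lights are really pretty.

It takes hardly any time from Eungbong Station (about 15 minutes?),

so it's great for a quick outing.

You barely have to hike, yet the summit is high, so you get a superb night view for very little effort.

The weather is nice these days, so I highly recommend a walk up Eungbongsan!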

[Bundang/Jeongja] Sushi Kun / photos

2018. 9. 2. 21:55

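Chawanmushi

Sanma and ?

Madai / red sea bream

Hamachi / young yellowtail

Hirame with soy-yuzu jelly / flounder

Mushi awabi, tako, mochiri tofu, shari

Maki

Nodoguro / blackthroat seaperch

Otoro / fatty tuna belly

Akami zuke topped with sanma

Suimono / clear soup

Nishin / herring

Uni aburi, hotate / seared scallop topped with sea urchin

Sunazuri / tuna belly (navel cut)

Ika / squid

Maki of kanpyo and yamagobo rolled in fish and fried / burdock, dried gourd

Cucumber topped with Japanese barley miso

Soup (jangguk)

Madai / red sea bream

Hamachi / young yellowtail

Tuna... akami?

Chutoro / medium-fatty tuna belly

Smoked Spanish mackerel with pickled onion

Vinegar-cured kohada / gizzard shad

Uni / sea urchin

Negitoro

Kaisendon

Fried shrimp

Large prawn

Shiitake

Anago / conger eel

Futomaki

Gyoku

Dessert (melon / tomato)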

[Sharosu-gil restaurants] Kumo Sikdang review

2018. 8. 22. 14:07

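I went to this place on a whim after seeing it on Instagram a while back.

So I wanted to take Instagram-style photos, but I turned out to be hopeless at it.

Some time has passed and my memory has faded, so I am searching online to jog it as I write ^___^

Googling is a wonderful thing.

1. Menu + taste

It has the feel of the Japanese home-style cooking that's trendy these days, but more like a fusion mixed with Korean and other food.

I wanted the soy-marinated salmon, but we are light eaters, so we cut it.

We ordered the miso pork neck rice bowl, the katsu sando, and the handmade croquette.

The nabe seems to be famous, and many tables ordered it. But I just ordered what I wanted to eat.

Even when a popular restaurant has a famous dish, I'm happier eating what I actually want.

Anyway, the food is... good. Not mind-blowing, but good.

For menu prices of around 9,000 to 11,000 won, it was worth the long trip to Sharosu-gil.

Our group especially liked the katsu sando. This was actually my first time having a proper katsu sando,

and the cutlet inside was not greasy, hard, or dry (in short, no bad texture),

not too salty, overpowering, or sweet (no bad flavor either),

just pleasantly sweet and properly katsu-like; anyway, it was delicious. Honestly it was my favorite of the three dishes.

2. Waiting

We went on a Monday or Tuesday around dinner time, and I thought a weekday would be less busy,

but since it is a hot spot in a trendy neighborhood, the wait is considerable. I think we waited well over 40 minutes,

and it was so hot and sticky, and I was so hungry, that I thought I would die.

For waiting there is one chair in a bit of shade, but mostly people line up in front of the door in the sun.

The place is small with few tables, and the food also took much longer to come out than at most restaurants.

That seems to make the wait even longer.

3. Conclusion

It's tasty, so it's worth going once. I want another katsu sando.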

[Paper/Action Recognition] I3D and the Kinetics dataset

2018. 8. 22. 13:35

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

Original paper: https://arxiv.org/abs/1705.07750

Vision research has shown that when a classifier is trained on ImageNet, features extracted from that network transfer well to other tasks, such as PASCAL VOC image recognition. This paper checks whether the same idea also works for video rather than images.
To take advantage of ImageNet pre-training, the authors propose a new two-stream model, I3D. They also review existing action recognition models (CNN + LSTM, C3D, Two-stream, Two-stream fusion) and use them as baselines for comparison with the new I3D model.
To obtain an effect like that of an ImageNet classifier, they collected the new Kinetics dataset, which contains 400 action classes with at least 400 videos per class.

In the experiments, the two-stream setup (RGB + optical flow) always outperformed RGB alone, and pre-training on Kinetics gave better results when applied to UCF/HMDB.
So using the Kinetics dataset in video research clearly seems to give an effect like ImageNet pre-training, but it is uncertain whether it is just as effective for other tasks such as semantic video segmentation or optical flow computation. For that reason the authors say they are releasing I3D models trained on Kinetics for future research.
DeepMind rocks.

Notes from trying to install OpenCV 3.3.0 on macOS

2018. 8. 22. 11:45

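Building and installing OpenCV 3.3.0

Installing OpenCV with pip install opencv-python gives you an unofficial package, so it is hard to configure, and to use all of OpenCV's features you have to build it yourself anyway.
Let's pull the source from git and build it ourselves.
The guide at https://www.pyimagesearch.com/2015/06/15/install-opencv-3-0-and-python-2-7-on-osx/ makes this easy to follow.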

git clone https://github.com/Itseez/opencv
git clone https://github.com/Itseez/opencv_contrib
Run git checkout 3.3.0 in both repositories so that both are on version 3.3.0
mkdir opencv/build && cd opencv/build
cmake -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX={{site-packages}} \
-D PYTHON2_LIBRARY={{python/bin}} \
-D PYTHON2_INCLUDE_DIR={{include/python}} \
-D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON \
-D BUILD_EXAMPLES=ON \
-D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules \
-D WITH_FFMPEG=ON ..

I plan to use FFMPEG's video codecs, so I added the last line.
If cmake is missing, install it with brew install cmake; if you don't have brew either, install that yourself. To find the values that go between {{ }}, start Python and run

>>> from sysconfig import get_paths
>>> from pprint import pprint
>>> info = get_paths()
>>> pprint(info)
then fill in the placeholders from what it prints.

make (-j4)
make install
ldconfig

But of course it never works just like that, because OpenCV is garbage.
Version 3.3.0 starts to conflict with CUDA 9.0, because CUDA's nppi library was split up into many separate libraries.
The issue is discussed at https://stackoverflow.com/questions/46584000/cmake-error-variables-are-set-to-notfound

Let's fix it.
Open opencv/cmake/FindCUDA.cmake. In version 3.3.0, the find_cuda_helper_libs(nppi) call on line 809 is the problem.
find_cuda_helper_libs(nppial)
find_cuda_helper_libs(nppicc)
find_cuda_helper_libs(nppicom)
find_cuda_helper_libs(nppidei)
find_cuda_helper_libs(nppif)
find_cuda_helper_libs(nppig)
find_cuda_helper_libs(nppim)
find_cuda_helper_libs(nppist)
find_cuda_helper_libs(nppisu)
find_cuda_helper_libs(nppitc)
Replace that call with the ten lines above.

On line 805, change
set(CUDA_npp_LIBRARY "${CUDA_nppc_LIBRARY};${CUDA_nppi_LIBRARY};${CUDA_npps_LIBRARY}") to
set(CUDA_npp_LIBRARY "${CUDA_nppc_LIBRARY};${CUDA_nppial_LIBRARY};${CUDA_nppicc_LIBRARY};${CUDA_nppicom_LIBRARY};${CUDA_nppidei_LIBRARY};${CUDA_nppif_LIBRARY};${CUDA_nppig_LIBRARY};${CUDA_nppim_LIBRARY};${CUDA_nppist_LIBRARY};${CUDA_nppisu_LIBRARY};${CUDA_nppitc_LIBRARY};${CUDA_npps_LIBRARY}")

On line 524, change unset(CUDA_nppi_LIBRARY CACHE) to
unset(CUDA_nppial_LIBRARY CACHE)
unset(CUDA_nppicc_LIBRARY CACHE)
unset(CUDA_nppicom_LIBRARY CACHE)
unset(CUDA_nppidei_LIBRARY CACHE)
unset(CUDA_nppif_LIBRARY CACHE)
unset(CUDA_nppig_LIBRARY CACHE)
unset(CUDA_nppim_LIBRARY CACHE)
unset(CUDA_nppist_LIBRARY CACHE)
unset(CUDA_nppisu_LIBRARY CACHE)
unset(CUDA_nppitc_LIBRARY CACHE)
In OpenCVDetectCUDA.cmake, in the if statement that starts at line 65, remove the Fermi (2.0) branch and make the Kepler branch the if.
On line 105, also remove 2.0 from set(__cuda_arch_bin "2.0 3.0 3.5 3.7 5.0 5.2 6.0 6.1").
A header has to be added to \modules\cudev\include\opencv2\cudev\common.hpp:
#include <cuda_fp16.h>
Insert this line at a suitable place.
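
The three nppi edits to FindCUDA.cmake described above are plain text substitutions, so they can be scripted instead of being done by hand against line numbers that shift between versions. The snippet below is a rough sketch, not a tested patch for any particular OpenCV release: the function name patch_find_cuda is hypothetical, and it only works on text you pass in, so reading and writing the real file is left to you.

```python
# Names of the libraries that replace the old monolithic nppi library,
# as listed in the steps above.
NPPI_PARTS = ["nppial", "nppicc", "nppicom", "nppidei", "nppif",
              "nppig", "nppim", "nppist", "nppisu", "nppitc"]

def patch_find_cuda(text):
    """Return FindCUDA.cmake text with every nppi reference split into its parts."""
    helper = "\n".join(f"find_cuda_helper_libs({p})" for p in NPPI_PARTS)
    unset = "\n".join(f"unset(CUDA_{p}_LIBRARY CACHE)" for p in NPPI_PARTS)
    libs = ";".join(f"${{CUDA_{p}_LIBRARY}}" for p in NPPI_PARTS)
    text = text.replace("find_cuda_helper_libs(nppi)", helper)
    text = text.replace("unset(CUDA_nppi_LIBRARY CACHE)", unset)
    text = text.replace("${CUDA_nppi_LIBRARY}", libs)
    return text

# Small stand-in for the relevant lines of the real file.
sample = (
    "unset(CUDA_nppi_LIBRARY CACHE)\n"
    "find_cuda_helper_libs(nppi)\n"
    'set(CUDA_npp_LIBRARY "${CUDA_nppc_LIBRARY};${CUDA_nppi_LIBRARY};${CUDA_npps_LIBRARY}")\n'
)
patched = patch_find_cuda(sample)
print(patched)
```

Running the function a second time leaves the text unchanged, since none of the replaced strings remain. The Fermi removal in OpenCVDetectCUDA.cmake and the cuda_fp16.h include are not covered here.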

make
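Something went wrong in opencv_cudaarithm.
Delete both the opencv and opencv_contrib folders.

Clone again and retry, this time from the master branch instead of 3.4.0....
The CUDA-related settings are already in place there.
Adding -D BUILD_opencv_freetype=OFF to the cmake command makes it succeed.
Version 3.4.0 also behaves strangely.
Let's try 3.1.0.......
Just use Docker!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!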

[Leave of absence] Notes on spending a year as nobody in particular

2018. 8. 3. 15:04

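The future looks pitch dark

How will I get by once I quit my job

No replies to my emails to prospective advisors

Another day of poking around in code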

[Grad school] Six months into grad school

2018. 8. 3. 14:45

[Employment] First day at the new job

2017. 12. 18. 23:57

License: Attribution, NonCommercial, NoDerivatives