
YANAI Lab.

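The University of Electro-Communications, Department of Informatics / Graduate School Department of Informatics, Media Informatics Course, Yanai Laboratory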

Video Recognition with a Multiple-Feature Fusion Framework

Zhiyuan Tang (湯 志遠)

February 6, 2009

1 Introduction

With the start of digital terrestrial broadcasting and the spread [...] of media such as HDDs and DVDs, ordinary individuals can now accumulate large amounts of digital image and video data. At the same time, as the volume of digital data grows, finding the data one wants to view within a large accumulated collection has become a problem. To address this situation, TRECVID, an international workshop that promotes research and development of video retrieval technology using a common test collection, is held.

In this study, we recognize [...] concepts using multiple [...]. In addition, in place of an ordinary SVM, we also introduce into the framework an MKL SVM, which fuses multiple feature kernels [...]

2 TRECVID

TRECVID [1] is a video retrieval workshop that grew out of TREC (Text REtrieval Conference), the text retrieval workshop run by a research division of NIST (National Institute of Standards and Technology) in the United States. Every year it defines common tasks and an evaluation criterion for each task. TRECVID2008, held this year, provided the following five tasks:

  • Surveillance event detection

  • High-level feature extraction

  • Search

  • Rushes summarization

  • Content-based copy detection

For each task [...] the TRECVID2008 high-level feature extraction [...]

3 Related Work

In a survey paper on the TRECVID workshop, Milind et al. [] of IBM Research showed that combining multiple features brings large improvements in areas such as video retrieval and classification. Since then, research on multiple-feature fusion has been very active. Tsinghua University, which ranked first in TRECVID2007, used 26 [...] based on color, texture, and edges [...]. [...] [2] proposed a learning method for MKL and achieved improved [...] on datasets such as Oxford Flowers and Caltech101.

Building on the work above, this study also [...] multiple [...]

4 Multiple-Feature Fusion Framework

In this study, multiple [...]. As shown in Figure 1, for the keyframes cut out of the video provided by TRECVID2008, five [...] (color, local pattern, text, motion, and face) are first [...] from the whole image, from grid partitions, and from local regions. Each feature is then modeled [...], and finally the AdaBoost [3] algorithm is applied to fuse the SVM output models and produce the final fused result. In addition, in place of an ordinary SVM, we introduce an MKL SVM (Multiple Kernel Learning SVM), which fuses the input features through a weighted linear sum of kernels and outputs the recognition result directly.

Figure 1: framework

5 Image Features

  • Color feature: (1) each axis of the RGB color space of the keyframe image is divided into 4, and the frequency of pixel values in each of the resulting regions is represented as a 64-[...] [...]
  • Local pattern feature: SIFT features are used. A keypoint detection algorithm extracts keypoints, and the pattern around each keypoint is encoded. As shown in Figure 2, keypoints are detected in three ways: DoG, random sampling, and grid sampling. To model an image using local patterns, the bag-of-keypoints [...]
    Figure 2: video indexing framework
  • Text feature: word vectors obtained by applying the tf-idf algorithm to the speech transcript data provided by TRECVID2008.
  • Motion feature: optical flow is computed over the 0.5 seconds before and after the keyframe, the angle space is divided into 12 bins, and a histogram is built in which each flow vector votes with its motion magnitude.
  • Face feature: the number of faces detected in the keyframe image.
  • Texture feature: a 24-[...] generated with Gabor filters of 6 orientations and 4 periods [...]
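The color and motion features above are both simple histograms, so they can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the implementation used in this study: the function names `rgb_histogram` and `motion_histogram` are hypothetical, normalizing each histogram to sum to 1 is an assumption, and the optical flow is assumed to be already available as an array of (dx, dy) vectors.

```python
import numpy as np

def rgb_histogram(image, bins_per_axis=4):
    """Histogram over an RGB cube split into bins_per_axis^3 cells (4^3 = 64)."""
    pixels = np.asarray(image, dtype=np.uint8).reshape(-1, 3).astype(np.int64)
    # Map each 0..255 channel value to a bin index 0..bins_per_axis-1.
    idx = (pixels * bins_per_axis) // 256
    flat = (idx[:, 0] * bins_per_axis + idx[:, 1]) * bins_per_axis + idx[:, 2]
    hist = np.bincount(flat, minlength=bins_per_axis ** 3).astype(np.float64)
    return hist / hist.sum()  # normalization is an assumption of this sketch

def motion_histogram(flow, n_bins=12):
    """Orientation histogram of flow vectors; each vector votes with its magnitude."""
    flow = np.asarray(flow, dtype=np.float64).reshape(-1, 2)
    magnitude = np.hypot(flow[:, 0], flow[:, 1])
    angle = np.arctan2(flow[:, 1], flow[:, 0]) % (2 * np.pi)
    # Clamp guards against an angle that rounds up to exactly 2*pi.
    bins = np.minimum((angle / (2 * np.pi) * n_bins).astype(np.int64), n_bins - 1)
    hist = np.bincount(bins, weights=magnitude, minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

# Usage on toy data: an all-black 2x2 image and two flow vectors.
color_feature = rgb_histogram(np.zeros((2, 2, 3), dtype=np.uint8))
motion_feature = motion_histogram([[1.0, 0.0], [0.0, 2.0]])
```

Voting with the magnitude, as in the motion feature description, means that large motions dominate the histogram while near-static pixels contribute almost nothing.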

6 $BE}9g%"%k%4%j%:%`(B

6.1 $B%V!<%9%F%#%s%0(B

$B%V!<%9%F%#%s%0$OE}7WE*3X=,M}O@$NJ,Ln$K$*$$$F!$J#?t$No$K@-G=$,9b$$<1JL4o$r@8@.$9$k3X=,J}K!<0$G$"$k!%BeI=E*$J(BAdaboost $B$G(B $B$O$"$k $B:#2s$O(BSVM$B$N=PNO$rE}9g$9$k$?$a$K!$(BAdaBoost$B$rMQ$$$k!%DL>o$N(BAdaBoost$B$O%V!<(B $B%9%F%#%s%0%i%&%s%I$4$H$K!$(BSVM$B$r:F3X=,$9$k$N$G!$7W;;%3%9%H$,Bg$-$$!$7k2L(B $B$,IT0BDj$J$I$N7gE@$,B8:_$9$k!%K\8&5f$G$O!$%V!<%9%F%#(B $B%s%0%i%&%s%I$4$H$K:F3X=,$7$J$$$G!$3X=,%G!<%?$N=E$_$N$_$r99?7$9$k(Breweight $B<0$N(BAdaBoost$B2~NIHG$rDs0F$7!$%*%j%8%J%k$N(BAdaBoost$B$HN>J}
$B?^(B 3: $B%V!<%9%F%#%s%0(B
\includegraphics[width=0.8\textwidth]{eps/boosting.eps}
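The reweight-type variant can be sketched as follows. This is a minimal sketch under assumptions, not the procedure actually used in this study: it assumes that one classifier per feature has been trained once beforehand, that its +1/-1 outputs on the training data are given, and that each round selects the fixed classifier with the lowest weighted error. The names `reweight_adaboost` and `fuse` are hypothetical.

```python
import numpy as np

def reweight_adaboost(preds, labels, n_rounds=10, eps=1e-10):
    """preds: (n_classifiers, n_samples) array of +1/-1 outputs of fixed,
    pre-trained classifiers; labels: (n_samples,) array of +1/-1.
    Returns one fusion weight per classifier. Nothing is retrained."""
    preds = np.asarray(preds, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    n_classifiers, n_samples = preds.shape
    sample_w = np.full(n_samples, 1.0 / n_samples)
    alphas = np.zeros(n_classifiers)
    wrong = (preds != labels).astype(np.float64)  # misclassification mask
    for _ in range(n_rounds):
        errors = wrong @ sample_w                 # weighted error per classifier
        best = int(np.argmin(errors))
        err = float(np.clip(errors[best], eps, 1.0 - eps))
        if err >= 0.5:                            # no classifier beats chance
            break
        alpha = 0.5 * np.log((1.0 - err) / err)
        alphas[best] += alpha
        # Only the training-sample weights change between rounds.
        sample_w *= np.exp(-alpha * labels * preds[best])
        sample_w /= sample_w.sum()
    return alphas

def fuse(alphas, preds):
    """Fused score per sample: weighted sum of the classifier outputs."""
    return alphas @ np.asarray(preds, dtype=np.float64)
```

Because the classifiers are fixed, each round costs only a matrix-vector product, which is the computational saving the reweight approach aims for.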

6.2 AP weighted fusion

AP weighted fusion is a fusion algorithm that computes the AP (average precision) of each feature on the training data and uses the AP as that feature's weight. Besides boosting, this study also [...] AP weighted fusion [...]
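A minimal sketch of AP weighted fusion is given below. It assumes that each feature's classifier yields one score per image, that AP is computed as the mean of the precision at ranks 1 to N (the definition given in Section 8.2), and that the weights are normalized to sum to 1; the normalization and the function names are assumptions of this sketch.

```python
import numpy as np

def average_precision(scores, relevant):
    """Mean of precision at ranks 1..N after sorting by descending score."""
    order = np.argsort(-np.asarray(scores, dtype=np.float64), kind="stable")
    rel = np.asarray(relevant, dtype=np.float64)[order]
    ranks = np.arange(1, len(rel) + 1)
    return float(np.mean(np.cumsum(rel) / ranks))

def ap_weighted_fusion(train_scores, train_relevant, test_scores):
    """train_scores, test_scores: (n_features, n_images) score arrays.
    Returns the fused test scores and the per-feature weights."""
    aps = np.array([average_precision(s, train_relevant) for s in train_scores])
    weights = aps / aps.sum()  # assumes at least one feature has AP > 0
    fused = weights @ np.asarray(test_scores, dtype=np.float64)
    return fused, weights
```

A feature that ranks the training data well receives a large weight, so it dominates the fused score on the test data.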

7 MKL SVM

MKL is a [...] that learns the optimal weights when multiple SVM kernels are combined linearly [...]
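The linear combination of kernels can be written as below. This is the generic form of a combined kernel, not necessarily the exact formulation used in this study; the symbols $\beta_j$ (kernel weights), $K_j$ (kernel computed from the $j$-th feature), and $M$ (number of features) are introduced here for illustration, and the constraints on $\beta_j$ differ between MKL variants.

```latex
K(x, x') = \sum_{j=1}^{M} \beta_j \, K_j(x, x'), \qquad \beta_j \ge 0

f(x) = \sum_{i} \alpha_i \, y_i \, K(x_i, x) + b
```

The SVM coefficients $\alpha_i$, $b$ and the kernel weights $\beta_j$ are learned together, so a feature whose kernel helps little receives a small weight.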

8 Experiments

8.1 Training and Test Data

[...]

8.2 Evaluation Measures

In general, the results of information retrieval are evaluated by precision. Precision is a measure of exactness, [...]

$\displaystyle \mathrm{precision} = \frac{\text{number of relevant images retrieved}}{\text{number of images in the retrieval result}}$

Average precision is then defined as follows, where $N$ is the number of images considered and $pre_{k}$ is the precision over ranks 1 to $k$:

$\displaystyle AP = \frac{1}{N}\sum_{k=1}^{N} pre_{k}$
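As a worked example of this definition, the sketch below computes AP from a ranked list of relevance judgments (1 for a relevant image, 0 otherwise). It follows the formula above, which averages the precision over all ranks 1 to N; note that the more common definition of AP averages only over the ranks at which a relevant image appears. The function names are hypothetical.

```python
def precision_at(relevant_ranked, k):
    """Precision over ranks 1..k: relevant images among the top k, divided by k."""
    return sum(relevant_ranked[:k]) / k

def average_precision(relevant_ranked):
    """AP as defined above: mean of precision at ranks 1..N."""
    n = len(relevant_ranked)
    return sum(precision_at(relevant_ranked, k) for k in range(1, n + 1)) / n

# Ranked result: relevant, irrelevant, relevant, irrelevant.
# Precision at ranks 1..4 is 1, 1/2, 2/3, 1/2, so AP = (1 + 1/2 + 2/3 + 1/2) / 4.
ap = average_precision([1, 0, 1, 0])
```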

In this study, results are evaluated with inferred average precision (infAP). Because the volume of TRECVID2006 data is enormous, only about half of the test data, selected by random sampling, is judged at evaluation time. Using inferred average precision makes a more accurate evaluation possible.

8.3 Results

[...] The best result obtained with each fusion method is shown in Table 1. For each of the 20 [...], the corresponding row of Table 1 gives the infAP of fusion by the modified AdaBoost, the original AdaBoost, AP weighted fusion, and MKL SVM, together with the average and the maximum infAP of the teams participating in TRECVID2008. The bottom row gives the mean of each column. [...]

In theory, the original AdaBoost retrains at every round, so its results are likely to be unstable. In contrast, the proposed simplified version uses all of the training data, so its results are stable. However, in this study [...]

Table 1: Comparison of results

Concept           smpAda  orgAda  APw     MKL     median  max
01.Classroom      0.0038  0.0015  0.0218  0.0239  0.008   0.152
02.Bridge         0.0055  0.0123  0.0249  0.0175  0.004   0.117
03.E_Vehicle      0.0017  0.0001  0.0062  0.0015  0.003   0.065
04.Dog            0.0188  0.0145  0.1503  0.1192  0.067   0.271
05.Kitchen        0.0053  0.0161  0.0523  0.0389  0.010   0.165
06.Airplane_fly   0.0301  0.0161  0.0255  0.0181  0.029   0.278
07.Two_people     0.0385  0.0201  0.0495  0.0007  0.050   0.174
08.Bus            0.0005  0.0007  0.0034  0.0032  0.004   0.119
09.Driver         0.0232  0.0268  0.0731  0.0682  0.046   0.324
10.Cityscape      0.0544  0.0803  0.1292  0.1138  0.059   0.258
11.Harbor         0.0085  0.0080  0.0110  0.0155  0.007   0.182
12.Telephone      0.0022  0.0023  0.0360  0.0168  0.011   0.136
13.Street         0.0760  0.0808  0.1746  0.0001  0.112   0.413
14.Demonstr       0.0126  0.0206  0.0502  0.0746  0.013   0.233
15.Hand           0.0665  0.0779  0.2035  0.0012  0.092   0.377
16.Mountain       0.0354  0.0401  0.0751  0.1154  0.042   0.246
17.Nighttime      0.1004  0.1358  0.1511  0.1571  0.105   0.323
18.Boat_Ship      0.1125  0.1017  0.1655  0.1330  0.093   0.394
19.Flower         0.0887  0.0912  0.1116  0.1154  0.058   0.161
20.Singing        0.0052  0.0168  0.0873  0.0211  0.013   0.258
mean              0.0345  0.0382  0.0801  0.0528  0.043   0.233

9 Future Work

[...] improving [...] is another issue for future work.

References

1
TREC Video Retrieval Evaluation.
http://www-nlpir.nist.gov/projects/trecvid/.

2
M. Varma and D. Ray.
Learning The Discriminative Power-Invariance Trade-Off.
In Proc. of IEEE International Conference on Computer Vision, pp. 1-8, 2007.

3
Y. Freund and R. E. Schapire.
Experiments with a New Boosting Algorithm.
In International Conference on Machine Learning, pp. 148-156, 1996.