The feature vectors to be converted are stored in roughly the following format:
135 1 0.0328217 0.0100089 0.000778311 0 4.92805e-005 5.75842e-007 1 0.0777054 0.129443 0.0319161 0.0646412 0.0244341 0.0152372 0.00485908 0.0127946 0.0031298 0.00783502 0.0125285 0.00126586 0.0105069 0.00283716 0.00630043 0.0025311 0.00454009 0.00269783 0.00262739 0.00419814 0.00102193 0.00319554 0.00344698 0.00145266 0.00241564 0.00231722 0.00144811 0.000919769 0.00106583 0.00170757 0.00113995 0.00193437 0.00134729 0.000975564 0.00163894 0.00160577 0.0010904 0.000848026 0.00160142 0.00176397 0.00176891 0.00204451 0.000736588 0.000142591 0.000819708 0.000699803 0.000852419 0.000318257 0.0004559 0.00084382 0.00070141 0.000193684 0.000341556 0.00121724 0.00114435 0.000771485 0.00137698 0.00106348 0.001131 0.00066602 0.00113994 0.000994646 0.000689515 0.000134377 0.000611667 0.000816581 7.27596e-012 0.000455511 0.000862932 0.000542742 0.000790653 0.00015376 0.000667005 0.00185994 0.00118799 0.00195221 0.00048036 0.00203206 0.0017036 0.00115241 0.00159665 0.000456289 0.00245261 0.00115743 0.00212574 0.000759982 0.00161923 0.000970768 0.000446351 0.0025218 0.000559381 0.00170117 0.00145245 0.00123497 0.00162679 0.0014485 0.00113292 0.00180706 0.00172861 0.00261046 0.000791285 0.00385214 0.00245815 0.0020356 0.00148777 0.00122579 0.00334875 0.00294233 0.00460947 0.0024289 0.00304412 0.00934439 0.00441411 0.00604666 0.00464733 0.010632 0.0119068 0.0184638 0.0183619 0.0177572 0.0191143 0.027884 0.0404363 0.0691513 0.0750528 0.167004 0.469292 ...
Each line holds one feature vector; the 135 at the head of the file means that each feature vector has 135 dimensions.
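As a rough illustration of reading this layout, here is a minimal sketch. It assumes the first whitespace-separated token is the dimension and that the values follow as plain decimal or exponent text; the helper names `read_dimension` and `read_vector` are hypothetical and are not part of the LIACS code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads the leading dimension token.
   Returns the dimension, or -1 on a read error or a non-positive value. */
static int read_dimension(FILE *in)
{
    int dim;
    if (fscanf(in, "%d", &dim) != 1 || dim <= 0)
        return -1;
    return dim;
}

/* Hypothetical helper: reads `dim` text values into `vec`.
   Returns 1 on success, 0 if the input ends early or is malformed. */
static int read_vector(FILE *in, int dim, float *vec)
{
    int i;
    assert(vec != NULL);
    for (i = 0; i < dim; ++i) {
        /* %f also accepts exponent forms such as 4.92805e-005 */
        if (fscanf(in, "%f", &vec[i]) != 1)
            return 0;
    }
    return 1;
}
```

Checking the return value of `fscanf` lets the caller stop cleanly when the file holds fewer vectors than expected.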
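The conversion code is based on the LIACS code with a few modifications, so that it can handle fractional feature values and convert the feature vectors of several images to HTK Format in one pass.
Only the modified parts are shown below: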
/* Returns inFloat with its four bytes in reverse order
   (converts between little-endian and big-endian layouts). */
float reverse_float( const float inFloat )
{
    float retVal;
    char *floatToConvert = ( char* ) & inFloat;
    char *returnFloat = ( char* ) & retVal;
    // copy the bytes into retVal in reverse order
    returnFloat[0] = floatToConvert[3];
    returnFloat[1] = floatToConvert[2];
    returnFloat[2] = floatToConvert[1];
    returnFloat[3] = floatToConvert[0];
    return retVal;
}
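int main(int argc, char* argv[])
{
    ...
    int number, frames = 61;
    int i, j; float x;
    ...
        /* Write the binary HTK header */
        /* Note: the first HTK header field is nSamples, the number of vectors in the file;
           if HTK_WriteHeader takes its arguments in header order, frames may be the intended value here. */
        HTK_WriteHeader(out, (unsigned int) number, (unsigned int) 1, (short) (number * sizeof(float)), (short) HTK_USER);
        for (j = 0; j < frames; ++j) {
            /* Write each feature value as one 4-byte float */
            for (i = 0; i < number; i++)
            {
                fscanf(in, "%f", &x);
                printf("%d: %f\n", i, x);
                //x = byte_swap_short(x);
                x = reverse_float(x);  /* HTK reads big-endian data by default */
                fwrite(&(x), sizeof(float), 1, out);
            }
        }
        fclose(out);
        fclose(in);
        /* Test the HTK I/O */
        //HTK_test(argv[2]);
    }  /* presumably closes a block opened in the elided code above */
    return 0;
}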
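frames is the number of feature vectors to convert (that is, the number of video frames);
in the HTK_WriteHeader call (not counting the file handle out), argument 3 is the feature vector dimension * sizeof(float), i.e. the size of one vector in bytes, and argument 4 is HTK_USER, which marks a user-defined feature type;
the type of variable x is changed to float so that it can hold fractional feature values.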
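For readers without the LIACS helper at hand, the following is a minimal sketch of what writing an HTK file involves. It is not the LIACS `HTK_WriteHeader`; the names `put_be32`, `put_be16`, `write_htk_header`, `put_be_float` and `HTK_PARM_USER` are hypothetical. It relies on the HTK parameter file layout: a 12-byte header holding nSamples (4 bytes, the number of vectors), sampPeriod (4 bytes, in 100 ns units), sampSize (2 bytes, bytes per vector) and parmKind (2 bytes, 9 for USER), all big-endian. It also assumes `float` is a 32-bit IEEE 754 type.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { HTK_PARM_USER = 9 };  /* HTK parameter kind code for USER */

/* Writes a 32-bit value most significant byte first. Returns 1 on success. */
static int put_be32(FILE *out, uint32_t v)
{
    unsigned char b[4];
    b[0] = (unsigned char)(v >> 24);
    b[1] = (unsigned char)(v >> 16);
    b[2] = (unsigned char)(v >> 8);
    b[3] = (unsigned char)v;
    return fwrite(b, 1, 4, out) == 4;
}

/* Writes a 16-bit value most significant byte first. Returns 1 on success. */
static int put_be16(FILE *out, uint16_t v)
{
    unsigned char b[2];
    b[0] = (unsigned char)(v >> 8);
    b[1] = (unsigned char)v;
    return fwrite(b, 1, 2, out) == 2;
}

/* Writes the 12-byte HTK header in field order. Returns 1 on success. */
static int write_htk_header(FILE *out, uint32_t n_samples, uint32_t samp_period,
                            uint16_t samp_size, uint16_t parm_kind)
{
    return put_be32(out, n_samples) && put_be32(out, samp_period) &&
           put_be16(out, samp_size) && put_be16(out, parm_kind);
}

/* Writes one float as big-endian bytes, independent of host byte order. */
static int put_be_float(FILE *out, float x)
{
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));  /* assumption: 32-bit float */
    memcpy(&bits, &x, sizeof bits);             /* reinterpret without aliasing issues */
    return put_be32(out, bits);
}
```

With 61 frames of 135 values each, a call would look like `write_htk_header(out, 61, 1, (uint16_t)(135 * sizeof(float)), HTK_PARM_USER)`, followed by one `put_be_float` per value. Emitting bytes explicitly avoids the need to know whether the host is little-endian, which the byte-reversal approach takes for granted.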
To verify that the output file is in valid HTK File Format, use HList from the HTK Tools:
- hlist -h filename > log.txt
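If HList is not available, the header can also be checked by hand. The sketch below assumes the standard 12-byte big-endian HTK header (nSamples, sampPeriod, sampSize, parmKind) and an uncompressed file; `HtkHeader`, `read_htk_header` and `expected_file_size` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint32_t n_samples;    /* number of vectors in the file */
    uint32_t samp_period;  /* sample period in 100 ns units */
    uint16_t samp_size;    /* bytes per vector */
    uint16_t parm_kind;    /* parameter kind code, 9 for USER */
} HtkHeader;

/* Reads the 12-byte big-endian header. Returns 1 on success, 0 on a short read. */
static int read_htk_header(FILE *in, HtkHeader *h)
{
    unsigned char b[12];
    assert(h != NULL);
    if (fread(b, 1, sizeof b, in) != sizeof b)
        return 0;
    h->n_samples   = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                     ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    h->samp_period = ((uint32_t)b[4] << 24) | ((uint32_t)b[5] << 16) |
                     ((uint32_t)b[6] << 8)  |  (uint32_t)b[7];
    h->samp_size   = (uint16_t)((b[8] << 8) | b[9]);
    h->parm_kind   = (uint16_t)((b[10] << 8) | b[11]);
    return 1;
}

/* Total size in bytes that an uncompressed file with this header should have. */
static unsigned long expected_file_size(const HtkHeader *h)
{
    return 12UL + (unsigned long)h->n_samples * h->samp_size;
}
```

For 61 vectors of 135 floats, the header should report a sample size of 540 bytes, and the file should be 12 + 61 * 540 = 32952 bytes long. A mismatch between that figure and the actual file size suggests a wrong nSamples or sampSize field.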