When a compiled user dictionary is built directly with a model file but without the '-a' option, user-specified fields are carried into the compiled user dictionary as is.
I expect the '-a' option to work the same way, because that is useful in the following case:
you want to control the costs/IDs of a limited number of records, but not of the remaining ones,
and you want to see the automatically assigned values for the remaining records in a CSV file;
then the generated CSV file can be easily integrated into the system dictionary.
The mecab-dict-index '-a' option unexpectedly overwrites user-specified costs/IDs.
The expected behavior of the '-a' option is that blank fields are filled in automatically, while user-specified fields are kept as is.
Below is an example.
Prepare a foo.csv file for building a user dictionary, as follows:
田町,,,3000,名詞,固有名詞,地域,一般,,,田町,タマチ,タマチ
Execute the following command (before doing this, you need to get the ipadic dictionary and its model file):
mecab-dict-index -m mecab-ipadic.model -d ipadic -u foo2.csv -f euc-jp -t euc-jp -a foo.csv
Then you get the following output in foo2.csv:
田町,1293,1293,8067,名詞,固有名詞,地域,一般,,,田町,タマチ,タマチ
As you see, the user-specified cost, 3000, is overwritten by 8067.
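The expected output in this case keeps the user-specified cost 3000 and fills in only the blank ID fields:
田町,1293,1293,3000,名詞,固有名詞,地域,一般,,,田町,タマチ,タマチ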
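To make the requested rule concrete, here is a minimal Python sketch of "fill blank fields, keep user-specified ones", applied to the rows from this report. It is not how mecab-dict-index is implemented. The names `ASSIGNABLE_COLUMNS`, `parse_row`, and `merge_row` are hypothetical, and the sketch assumes only the usual MeCab CSV layout in which columns 2 to 4 hold the left ID, right ID, and cost.

```python
import csv
import io

# Zero-based indexes of the fields that '-a' assigns automatically:
# left context ID, right context ID, and cost (assumed MeCab CSV layout).
ASSIGNABLE_COLUMNS = (1, 2, 3)


def parse_row(line):
    """Split one dictionary CSV line into its fields."""
    return next(csv.reader(io.StringIO(line)))


def merge_row(user_row, auto_row):
    """Keep user-specified IDs/cost; take the automatic value only for blanks."""
    merged = list(user_row)
    for i in ASSIGNABLE_COLUMNS:
        if merged[i].strip() == "":
            merged[i] = auto_row[i]
    return merged


# Input row from foo.csv and the row that '-a' currently writes to foo2.csv.
user = parse_row("田町,,,3000,名詞,固有名詞,地域,一般,,,田町,タマチ,タマチ")
auto = parse_row("田町,1293,1293,8067,名詞,固有名詞,地域,一般,,,田町,タマチ,タマチ")

merged_line = ",".join(merge_row(user, auto))
print(merged_line)
```

With this rule, the blank IDs take the automatically assigned value 1293 while the cost stays at 3000 instead of being replaced by 8067.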