The Doujinshi & Manga Lexicon Forum

MoeMoe
VIP

KI: 1333

【Standardizing Romanization】

Each editor romanizes in their own way, so an entry can end up being edited back and forth.
Here I'll list the problems that romanization raises and try to build a consensus on standardizing it.

(1) Basic transliteration
(2) Spacing
(3) Capitalization
(4) Hyphenation
(5) Valid characters
(6) Volume numbers

See also
http://www.doujinshi.org/forum/index.php?IIID=1502
2013-04-14 14:08:18

MoeMoe
VIP

KI: 1333

【MoeMoe's standard】
(0) When the author gives a romanization on the work itself, I follow it.


(1) Basic transliteration
し → SHI (× SI)
ち → CHI (× TI)
つ → TSU (× TU)
ふ → FU (× HU)
じ → JI (× ZI)
しゃ → SHA (× SYA)
ちゃ → CHA (× TYA)
ぢ・づ → JI, ZU (× DI, DU)
っち → CCHI (× TCHI)
おう → OU (× O, OO, OH, Ō, Ô) ※because of (5) below, and to distinguish it from "おお"
えい → EI (× Ē, Ê)
んあ → N'A (× NA)
んな → N'NA (× NNA) ※to distinguish from "っな"
んま → NMA (× MMA)
postpositional particles は・を・へ → WA, O, E (× HA, WO, HE)
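
As a rough illustration of the table above, here is a minimal Python sketch of a hiragana-to-romaji converter. The names `romanize`, `ROWS`, `KANA` and `DIGRAPHS` are made up for this example. It covers only plain hiragana plus a few digraphs, it maps を to "o" on the assumption that を is always a particle, and it also writes N' before "y" (an assumption that goes beyond the table). It cannot tell when は or へ is a particle, so WA and E still need a human eye.

```python
# Consonant rows of the kana table; each row pairs with the vowels a, i, u, e, o.
ROWS = {
    "": "あいうえお", "k": "かきくけこ", "s": "さしすせそ", "t": "たちつてと",
    "n": "なにぬねの", "h": "はひふへほ", "m": "まみむめも", "r": "らりるれろ",
    "g": "がぎぐげご", "z": "ざじずぜぞ", "d": "だぢづでど", "b": "ばびぶべぼ",
    "p": "ぱぴぷぺぽ",
}
KANA = {k: c + v for c, row in ROWS.items() for k, v in zip(row, "aiueo")}
KANA.update({"や": "ya", "ゆ": "yu", "よ": "yo", "わ": "wa"})
# Spellings the table asks for instead of SI, TI, TU, HU, ZI, DI, DU.
KANA.update({"し": "shi", "ち": "chi", "つ": "tsu", "ふ": "fu",
             "じ": "ji", "ぢ": "ji", "づ": "zu"})
KANA["を"] = "o"  # assumption: を only ever appears as a particle

DIGRAPHS = {"しゃ": "sha", "しゅ": "shu", "しょ": "sho",
            "ちゃ": "cha", "ちゅ": "chu", "ちょ": "cho",
            "じゃ": "ja", "じゅ": "ju", "じょ": "jo"}


def romanize(text):
    """Romanize hiragana per the table above; unknown characters pass through."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if text[i:i + 2] in DIGRAPHS:
            out.append(DIGRAPHS[text[i:i + 2]])
            i += 2
            continue
        # Romaji of whatever follows the current character ("" at the end).
        nxt = DIGRAPHS.get(text[i + 1:i + 3]) or KANA.get(text[i + 1:i + 2], "")
        if ch == "っ":
            out.append(nxt[:1])  # double the next consonant: っち -> cchi
        elif ch == "ん":
            # n' before a vowel or n (and, as an assumption, y); plain n otherwise
            out.append("n'" if nxt and nxt[0] in "aiueoyn" else "n")
        else:
            out.append(KANA.get(ch, ch))
        i += 1
    return "".join(out)


print(romanize("まっちゃ"))  # maccha
print(romanize("おんな"))    # on'na
print(romanize("さんま"))    # sanma
```

The output is lower case on purpose; spacing and capitalization are separate rules, (2) and (3).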


(2) Spacing
・Put a space before each 文節 (bunsetsu, a phrase unit in a Japanese sentence) and before each 助詞 (postpositional particle)
・Put a space between the nouns of a noun-noun compound
・Put a space before a volume number (I waver on this when it conflicts with rule (0) above)


(3) Capitalization
・Capitalize the first letter of each 自立語 (independent/freestanding word)


(4) Hyphenation
・Join prefixes and suffixes to the word with a hyphen (I'm quite inconsistent about this)


(5) Valid characters
From the "Forbidden characters" section of the guide
http://doujinshi.mugimugi.org/guide/?ID=4
I infer that this DB wants only standard ASCII characters in the romanized title field.
So I think we should use only characters that can be typed without an input method
http://en.wikipedia.org/wiki/Input_method
A typical example of what should not be used is letters with diacritical marks for long vowels (like Ā or Â).
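
To make the ASCII-only idea concrete, here is a small Python sketch that lists the characters in a romanized title that fall outside 7-bit ASCII. The function name `non_ascii_chars` is hypothetical, and the site itself may check titles differently (or not at all).

```python
def non_ascii_chars(title):
    """Return the characters of `title` that are outside 7-bit ASCII, in order."""
    return [ch for ch in title if ord(ch) > 127]


print(non_ascii_chars("Tōkyō"))    # ['ō', 'ō']
print(non_ascii_chars("Toukyou"))  # []
```

An empty list means the title can be typed on a plain keyboard without an input method.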


(6) Volume numbers
When a single continuous story is published in installments, I write the volume number in Arabic numerals inside (parentheses), whatever notation the actual book uses: Roman numerals, kanji numerals, circled numbers, 第##巻, Vol.##, 前・後編, 上下巻, and so on.
This covers works that still hold together when the number of books changes, such as an original edition later reprinted as a bunko edition.
However, I keep Vol.## for periodicals such as magazines.
On the other hand, for series in which each volume is self-contained, such as periodically published anthologies, I write the volume number in Arabic numerals without (parentheses).
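
A minimal Python sketch of the volume-number rule, under some assumptions: `roman_to_int` and `title_with_volume` are names invented for this example, "Hoge" is a placeholder title, and only Roman numerals typed with the ASCII letters I, V, X, L and C are handled. Kanji numerals and 上/下 would need their own lookup.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100}


def roman_to_int(numeral):
    """Convert an ASCII Roman numeral such as 'IV' to an integer."""
    total = 0
    prev = 0
    for ch in reversed(numeral.upper()):
        value = ROMAN_VALUES[ch]
        # A smaller value written before a larger one is subtracted (IV = 4).
        if value < prev:
            total -= value
        else:
            total += value
            prev = value
    return total


def title_with_volume(title, volume, one_story=True):
    """Append '(2)' for one story split across books, plain '2' otherwise."""
    return f"{title} ({volume})" if one_story else f"{title} {volume}"


print(title_with_volume("Hoge", roman_to_int("II")))             # Hoge (2)
print(title_with_volume("Hoge Anthology", 3, one_story=False))   # Hoge Anthology 3
```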
2013-04-14 14:08:36
Lepetit89
KI: 77
First of all, thanks a lot for posting this. A little more than a year ago, I already requested that we agree on standards regarding romanization, but there was barely any interest.

However, what was pointed out was the fact that it's not possible to enforce standards 100%, and I think doing so would actually be detrimental to the progress we make here.

Nonetheless, what I suggested back then was that we simply use these standards as guidelines - if someone with the necessary rights confirms a romanization that does not match the standards we agree on, then another user should be allowed to change the entry in question and bring it up to the standards without any "edit wars" ensuing.

If we could incorporate that into the rules, then we would, even though there would still be entries not matching the standards, steadily move towards a database that consists of standardized entries for the most part.

I, for one, would wholeheartedly welcome such a change as it would make both creating and editing entries as well as searching for them much easier and less awkward. I tend to edit romanizations that do not match my own standards, however, I'm always concerned about this seeming rude to someone.

If we can agree on standardizing romanizations being a good idea, then we should see if the suggestions by MoeMoe regarding what standards we enforce are acceptable and just use that or make other suggestions.

If someone with authority gives a definite okay that standardized romanizations are possible here, then we should probably decide on the standards as soon as possible, as I currently have a lot of entries awaiting confirmations that are issues exactly because of the lack of standards; though, looking over the guidelines above, I wouldn't mind simply accepting those; any standard is better than none, and I find myself agreeing with most of the guidelines written down above.

2013-04-14 20:36:17
gilic
KI: 224
I appreciate your effort to create a standard to use. Some practical examples for every point would be nice and maybe a list of frequently recurring kanji with the accompanying proper romanization.
2013-04-15 17:58:48
MoeMoe
VIP

KI: 1333

Good to hear that from you two.

Then I'll get straight to the point: you two have different standards for capitalization.
Talk to each other and develop some consensus.
Until then, I can't confirm your edits, because doing so would mean I'm choosing one of them.

I'm just stating my opinion; I don't want to force it on anybody.
2013-04-18 02:55:19
Lepetit89
KI: 77
Wonderful, we're making progress, I'm getting all giddy!

Anyway, consensus can be difficult to achieve, so I'll just go for a compromise right away - I suggest we just use the guidelines as written above.

While they're a little different from my university's standards, they're still close enough to Hepburn Romanization to not bother me too much. Furthermore, since MoeMoe is the one who has to confirm the large majority of entries added, I think it would be helpful if we used the system he is most familiar with - which would be his own, naturally.

Furthermore, not all cases of romanization are that clear and I'm not sure if we can cover everything with the guidelines posted here. Since almost all of the communication regarding romanization issues takes place between the editor and the person who confirms the edit, I suggest we add problematic cases to the guidelines as we encounter them in order to keep them up-to-date.

For that purpose, I suggest we either use this thread or a separate one to ask questions or to discuss possible issues.
2013-04-19 18:11:12
gilic
KI: 224
Lepetit89, it seems you are studying Japanese and MoeMoe is a native speaker, so I'll follow your decision. I don't speak it myself, and I use various tools and common sense when I try to romanize short kanji titles.

My goal is to achieve a consistent style so stuff which belongs together is in order and looks the same.

Anyway some comments/questions:

(0)
This includes errors by the author on the cover, different capitalization between issues of a series and changes in the writing style of the numbering?

(3)
Does this mean pretty much every word should be upper case? I think I'm doing that already, but I tend to write particles lower case.

(5)
Do we try to replace invalid characters with similar ASCII characters or just use a space? I prefer the latter.

(6)
I personally like roman numbers in titles, but I'm OK with converting them to Arabic numbers.

Some titles use Greek numbers (α,β,...), do we write them out (alpha,beta,...) or convert them? I think most of the time they are used less for numbering and more as part of the name so I prefer again the latter.

(7) new
幻想郷(Gensou Sato) - Gensoukyou is the name of the world in the Touhou franchise. I have seen different writings across the database. For this case and similar ones I'd like to decide in this thread for an accepted writing.

Thanks for reading and again the efforts.

PS. I'm German so in the worst case translating goes something like this: kanji->romanization->english->german ^^
Oh and I'd like to keep the discussion in this thread so everything relevant is in one place.
2013-04-20 13:42:07
MoeMoe
VIP

KI: 1333

【Romanization for Japanese users】

We Japanese can read Japanese, so we don't need romanized titles at all.
We just... if I may say so, don't care a straw.

I think it is non-native speakers of Japanese who need romanization.
So if you say "I want romanization like this", I'll follow it (or state my opinion).

【MoeMoe's standard for confirming】

I confirm a romanization unless it's obviously wrong.
When something feels off to me as a native Japanese speaker, I ask "are you sure?", which means "do you have an authoritative source?"

If you change, in other words fix, a romanization, I ask the contributor to prove that the change is more appropriate.
cf) http://doujinshi.mugimugi.org/forum/index.php?IIID=1388



It's not the purpose of this thread to force my standard on anybody.

What perplexes me is the CHANGING of romanizations.
If you are the first one to romanize a book's title, I'll confirm it even if it doesn't follow my standard, or any standard at all (unless it's obviously wrong).

Or should I perhaps confirm all changes...?
That would be much easier for me.
2013-04-20 23:56:19
MoeMoe
VIP

KI: 1333

For gilic

(0)
> This includes errors by the author on the cover

Yes.

> different capitalization between issues of a series and changes in the writing style of the numbering?
I can't say I romanize consistently all the time.
If I romanized one hundred books now, I don't think I could romanize them 100% the same way three months later.

But I think I'd try to unify the notation.

(3)
> Does this mean pretty much every word should be upper case? I think I'm doing that already, but I tend to write particles lower case.


"Write particles in lower case": that's what I do.
I write hyphenated prefixes and suffixes in lower case, too.
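
For illustration, a minimal Python sketch of that capitalization habit: capitalize every word except particles. The particle list and the name `capitalize_title` are assumptions made for this example, since the thread does not define a list. A word like "e" or "to" can also be an ordinary noun, so the result still needs checking by hand, and rule (0) (the author's own romanization) always wins.

```python
# Assumption: a small hand-picked particle list; the thread does not define one.
PARTICLES = {"wa", "o", "e", "no", "ni", "ga", "to", "de", "mo"}


def capitalize_title(romaji):
    """Capitalize the first letter of each word, keeping particles lower case."""
    words = []
    for word in romaji.split():
        if word.lower() in PARTICLES:
            words.append(word.lower())
        else:
            # Only the first letter is touched, so the rest of the word is kept as typed.
            words.append(word[:1].upper() + word[1:])
    return " ".join(words)


print(capitalize_title("neko no kimochi"))  # Neko no Kimochi
```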


(5)
> Do we try to replace invalid characters with similar ASCII characters or just use a space? I prefer the latter.


ex 1) Circle 黒 vs. 黒。
http://doujinshi.mugimugi.org/browse/circle/21871/
http://doujinshi.mugimugi.org/browse/circle/5190/

ex 2) Circle TRICK STAR vs. TRICK★STAR
http://doujinshi.mugimugi.org/browse/circle/26280/
http://doujinshi.mugimugi.org/browse/circle/17071/

I think replacing with similar ASCII is better.


(6)
> I personally like roman numbers in titles, but I'm OK with converting them to Arabic numbers.


Roman numerals are platform-dependent characters.
I want to ask everyone, especially the admins: are they OK to use?

I use Arabic numerals for the sake of ordering.
Non-VIP users can't change VIP SETTINGS
http://doujinshi.mugimugi.org/settings/
so most users list objects in the Default Object Order, title-ascending.

case 1) Roman numerals Ⅰ, Ⅱ, Ⅲ...
→ platform-dependent characters. I don't want to use them for now.

case 2) circled numbers ①, ②, ③...
→ platform-dependent characters. And there are no such characters for 21 and above.

case 3) Kanji numerals 一, 二, 三, 四, 五, 六, 七, 八, 九, 十, 十一, 二十, 百
sorting result (original title-ascending) → 一, 九, 五, 三, 四, 七, 十, 十一, 二, 二十, 八, 百, 六
(1, 9, 5, 3, 4, 7, 10, 11, 2, 20, 8, 100, 6)

case 4) variant kanji numerals 壱, 弐...
→ same problem as 一, 二...

case 5) 上巻, 下巻 / 前編, 後編
→ same problem as 一, 二...

case 6) with/without zero-padding to the same number of digits
1, 2, 3, 10, 11, 20, 100
→ 1, 10, 100, 11, 2, 20, 3

001, 002, 003, 010, 011, 020, 100
→ 001, 002, 003, 010, 011, 020, 100

> Some titles use Greek numbers (α,β,...), do we write them out (alpha,beta,...) or convert them?.

Oh, I never thought of them as numbers.
As for me, I write them out.
2013-04-21 01:26:07
Lepetit89
KI: 77
MoeMoe Wrote:
【MoeMoe's standard for confirming】

I confirm a romanization unless it's obviously wrong.
When something feels off to me as a native Japanese speaker, I ask "are you sure?", which means "do you have an authoritative source?"

If you change, in other words fix, a romanization, I ask the contributor to prove that the change is more appropriate.
cf) http://doujinshi.mugimugi.org/forum/index.php?IIID=1388



It's not the purpose of this thread to force my standard on anybody.

What perplexes me is the CHANGING of romanizations.
If you are the first one to romanize a book's title, I'll confirm it even if it doesn't follow my standard, or any standard at all (unless it's obviously wrong).

Or should I perhaps confirm all changes...?
That would be much easier for me.


I think you're missing the point of our intentions - it's not the purpose of this request to "force" your standard on others - the purpose is to finally establish a standard of some kind, any kind - I'd translate the romanized titles into Esperanto if that were the standard (not that I speak the language, but you get my point).

I need that standard so I can simply change a romanized title, refer to this thread if complaints come up, and be done with it. I spend a lot of time on each entry I edit and I want the entries I take care of to actually look good.
However, if I have no standards I can simply refer to, every edit I make to standardize my entries winds up turning into peace negotiations where I have to "develop a consensus" - that is not productive at all and it doesn't help the database in the least for the following reason:

Let's assume we have a series of Doujinshi with 5 installments, all JP titles are identical + the respective entry's number, all entries have romanized titles.
The first one isn't capitalized at all, the second one is written in caps-lock, the third one has the first letter of each word capitalized, the fourth one only the first letter of the first word and the fifth one doesn't even need to have anything done to it, as you already understand my problem - the end result is that it looks terrible, plain and simple.

When I look up a circle, I'd prefer having all titles in alphabetical order and standardized - it makes looking them up easier and looks good. But the scenario outlined above is an insult to my eyes as well as my sense for order.

I'm not changing titles for the fun of it or because I want to disrespect others or prove them wrong. I respect everyone involved in this project a great deal, but I think that, no matter what we do, at the end of the day the results of our work should still have an aesthetical appeal to them. It's not like everyone would have to follow the standards down to a T - they can use caps-lock all they want. All I want is the right to clean up after them, for the sake of the database, as I believe that only a clean database is a good database.
However, for that purpose, I need those standards.


2013-04-21 09:48:11
MoeMoe
VIP

KI: 1333

So the standard you want applied is "uniformity between series > notation of the original"?

> I believe that only a clean database is a good database.
I think the quality of a database lies in the minute, detailed mapping of objective facts.
I'm not saying a clean database is unimportant, and I agree with uniformity, but I don't think it's the maximum priority.
2013-04-22 16:27:47
Lepetit89
KI: 77
MoeMoe Wrote:
So the standard you want applied is "uniformity between series > notation of the original"?

> I believe that only a clean database is a good database.
I think the quality of a database lies in the minute, detailed mapping of objective facts.
I'm not saying a clean database is unimportant, and I agree with uniformity, but I don't think it's the maximum priority.


Ah, no, sorry, it's not that I want uniformity between series, the author still decides; it was just an exaggerated example. It could have been five completely independent works by the author; the point was that five successive entries, each using different standards for romanization, look terrible.

However, I also agree that what I'm asking for here is not maximum priority, no doubt; nonetheless, it's not like our other goals and maintaining a clean database are mutually exclusive. They aren't, not in the least.

Those who don't care won't have to do anything, it's not like people can't romanize in whatever fashion they wish.
As for myself, I'm still only saying that I just want the right to clean the entries in question. Then I can get to cleaning my Doujinshi / circles that don't match the standard and just take it from there.
This is merely about providing those who would like to do something about standardized entries with the right to do so.

We're not enforcing anything, we're not telling people they mustn't do it either; we're simply letting the people who do care clean up the database.
2013-04-23 07:40:44
gilic
KI: 224
Sorry took me a while to reply.

(5)
MoeMoe Wrote:
> Do we try to replace invalid characters with similar ASCII characters or just use a space? I prefer the latter.

ex 1) Circle 黒 vs. 黒。
http://doujinshi.mugimugi.org/browse/circle/21871/
http://doujinshi.mugimugi.org/browse/circle/5190/

ex 2) Circle TRICK STAR vs. TRICK★STAR
http://doujinshi.mugimugi.org/browse/circle/26280/
http://doujinshi.mugimugi.org/browse/circle/17071/

I think replacing with similar ASCII is better.


OK. I see that this would be better for circles and artists, but what do you think about book titles? It would be a lot easier to use a space, and it isn't that easy to find a proper replacement for an invalid character.

(6)
MoeMoe Wrote:
> I personally like roman numbers in titles, but I'm OK with converting them to Arabic numbers.

Roman numerals are platform-dependent characters.
I want to ask everyone, especially the admins: are they OK to use?

I use Arabic numerals for the sake of ordering.
Non-VIP users can't change VIP SETTINGS
http://doujinshi.mugimugi.org/settings/
so most users list objects in the Default Object Order, title-ascending.

case 1) Roman numerals Ⅰ, Ⅱ, Ⅲ...
→ platform-dependent characters. I don't want to use them for now.


I agree; until now I've used I, V and X rather than the proper characters.

MoeMoe Wrote:
case 2) circled numbers ①, ②, ③...
→ platform-dependent characters. And there are no such characters for 21 and above.

case 3) Kanji numerals 一, 二, 三, 四, 五, 六, 七, 八, 九, 十, 十一, 二十, 百
sorting result (original title-ascending) → 一, 九, 五, 三, 四, 七, 十, 十一, 二, 二十, 八, 百, 六
(1, 9, 5, 3, 4, 7, 10, 11, 2, 20, 8, 100, 6)

case 4) variant kanji numerals 壱, 弐...
→ same problem as 一, 二...

case 5) 上巻, 下巻 / 前編, 後編
→ same problem as 一, 二...

case 6) with/without zero-padding to the same number of digits
1, 2, 3, 10, 11, 20, 100
→ 1, 10, 100, 11, 2, 20, 3

001, 002, 003, 010, 011, 020, 100
→ 001, 002, 003, 010, 011, 020, 100


So the most convenient way to achieve the correct sorting for most of the users would be to only use Arabic numbers and pad zeros when necessary.
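
The sorting effect is easy to reproduce. The Python sketch below assumes the site sorts titles as plain text, which is what the title-ascending results quoted above suggest; the variable names and the width of 3 digits are just choices for this example.

```python
volumes = [1, 2, 3, 10, 11, 20, 100]

# Sorted as text without padding: "10" and "100" come before "2".
plain = sorted(str(v) for v in volumes)
print(plain)   # ['1', '10', '100', '11', '2', '20', '3']

# Zero-padded to three digits, text order matches numeric order.
padded = sorted(str(v).zfill(3) for v in volumes)
print(padded)  # ['001', '002', '003', '010', '011', '020', '100']
```

The padding width has to be picked in advance, so a series that unexpectedly runs past 999 volumes would need its earlier entries re-padded.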

MoeMoe Wrote:
> Some titles use Greek numbers (α,β,...), do we write them out (alpha,beta,...) or convert them?.

Oh, I never thought of them as numbers.
As for me, I write them out.


Just to correct myself: α, β, ... are letters of the Greek alphabet, but as I said, they can also be used for counting.

Lepetit89 Wrote:
I'm not changing titles for the fun of it or because I want to disrespect others or prove them wrong. I respect everyone involved in this project a great deal, but I think that, no matter what we do, at the end of the day the results of our work should still have an aesthetical appeal to them. It's not like everyone would have to follow the standards down to a T - they can use caps-lock all they want. All I want is the right to clean up after them, for the sake of the database, as I believe that only a clean database is a good database.
However, for that purpose, I need those standards.

This is merely about providing those who would like to do something about standardized entries with the right to do so.

We're not enforcing anything, we're not telling people they mustn't do it either; we're simply letting the people who do care clean up the database.


Totally agree with you here, Lepetit89.


One last question from me to you two (or anyone else who cares):
What's your opinion on using different numbering schemes in the Original field (keep what the artist decided on) and the Romanized field (convert everything to Arabic)?
2013-04-28 17:30:39
Lepetit89
KI: 77
I'm sorry if this might seem a little bold, but I'd prefer actually solving this problem for once; I do not intend to forget about it, so I'd appreciate it if we could keep working on a solution.
2013-05-05 08:40:18
Lepetit89
KI: 77
Well, I suppose not replying also tells more than enough about what I need to know.

Given the lack of actual participation in this topic, I wondered if I might have invested all of these hours into the wrong database, but I suppose I'll just entirely avoid romanizations unless the author himself provides them.

I was trying to establish standards with good intentions, but to see that these are met with a complete and utter lack of interest is disappointing, to say the least. I wouldn't have minded if I had been convinced that standards would have caused issues or led to problems, but the discussion was just dropped in the middle, completely ignoring any arguments in favor or even against the standardized romanizations.

As I said, I find this disappointing, as I prefer a friendly and collegial atmosphere while working on projects like these; the, frankly speaking, disrespectful way this issue was being handled more than implies that this is simply work to everyone involved, and strictly that.

Anyway, I'll leave it at that and just avoid any aspects of the database that involve arguments, discussions and/or communication. The database still has a tremendous worth to me and I enjoy contributing using the information my collection provides me with, so I'll definitely continue contributing, but right now the only conclusion is that it was wrong of me to expect any kind of collegial or social involvement in the first place. As a result, I'll change my stance accordingly to avoid future disappointment.
2013-05-17 10:25:48
MoeMoe
VIP

KI: 1333

to gilic

My list of conversions from fullwidth characters to ASCII

★, ☆ (star) → * (asterisk)
~ (wave) → ~ (tilde)
× (x-mark) → x or X
・ (middle dot) → . or - (depends on context)
。 (Japanese period) → . (period)
、 (Japanese comma) → , (comma)
→ (right pointing arrow) → -> (minus + greater-than sign)
← (left pointing arrow) → <- (less-than sign + minus)
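
As a sketch of how the list above could be applied mechanically, here is a small Python translation table. The names `FULLWIDTH_TO_ASCII` and `to_ascii_symbols` are made up for this example. Two assumptions: the middle dot always becomes "." here even though the list allows "-" depending on context, and both U+FF5E (fullwidth tilde) and U+301C (wave dash) are treated as the "wave" character.

```python
# Mapping taken from the conversion list above; see the lead-in for the two assumptions.
FULLWIDTH_TO_ASCII = str.maketrans({
    "★": "*", "☆": "*",
    "\uff5e": "~", "\u301c": "~",   # fullwidth tilde, wave dash
    "×": "x",
    "・": ".",                       # could also be "-" depending on context
    "。": ".",
    "、": ",",
    "→": "->",
    "←": "<-",
})


def to_ascii_symbols(text):
    """Replace the listed fullwidth symbols with their ASCII stand-ins."""
    return text.translate(FULLWIDTH_TO_ASCII)


print(to_ascii_symbols("TRICK★STAR"))  # TRICK*STAR
```

Characters that are not in the table pass through unchanged, so kana and kanji still have to be romanized separately.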


to Lepetit89

I didn't get enough of a reply to the NOTICE I wrote to you.
When you CHANGED the romanizations, I asked you to prove that your romanization was better, but what you did was show me your own standard (capitalizing English words).
What I wanted to hear from you was WHY it's better.

I accepted your edit when you showed that it was the romanization the author wrote on his website.

And don't let it bother you when you get no reply.
See my posts on the forum. There are so many threads where I got no reply, and so many NOTICEs that went and still go unanswered. It's an everyday occurrence in web communities.
2013-06-02 19:13:07
gilic
KI: 224
Thanks for the reply. I'll try to use your conversions; they make sense to me.

It would certainly be easier to work on the db if the website provided more tools/support for doing so.
2013-06-09 15:20:57
Akari
Administrator

KI: 153

Although I didn't read every piece of the proposed standards, I think the standardizations that I see are good. However, I think all of us should remember that there is so much work to be done in this project, and we should not be so picky and critical about romanizations when compared with the whole huge scope of what this site is about. For example, I'd rather people who submit data put their energy into properly identifying the authors and circles of a book than concern themselves with aspects of minor importance such as how to deal with ☆. And I would rather someone romanize し as SI (rather than SHI, which I use and think is superior to SI) than not romanize at all. After all, SI is not an incorrect romanization of し; it is simply not as good. And it is better to tolerate the worse than to settle for nothing. (A person who is new to submitting data may start with SI and later change to SHI after becoming more experienced on this site, or another person can improve the romanization.)

>>Lepetit89,

Please do not allow yourself to get excited so much about the fine-points of romanization. Your work on this site is excellent and it is not good to become frustrated over such a small matter as this. It is sad to think you will stop work simply over such a small matter as this as it turns something small into something large.


_______________

As for me, I do not care if someone changes my submissions to this site as long as it is an improvement to what I've made. For example, consider what MoeMoe wrote,

"★, ☆ (star) → * (asterisk)
~ (wave) → ~ (tilde)
× (x-mark) → x or X
・ (middle dot) → . or - (depends on context)
。 (Japanese period) → . (period)
、 (Japanese comma) → , (comma)
→ (right pointing arrow) → -> (minus + greater-than sign)
← (left pointing arrow) → <- (less-than sign + minus)"

I've romanized many titles and did not abide by these standardized rules. First of all, this discussion did not happen until recently, after I had already made many submissions. And I did some recent romanizations that didn't even obey these rules. For example, I substituted ~ with - instead of ~. But if someone wants to modify my submission and change - into ~, then I do not mind. I do not care enough myself to do this work. I told another person on this forum a similar thing: there are many ways to contribute to this database. I think I know more than many non-Japanese about authors and circles. I think that this is my main strength. So I want to continue with my focus on this aspect of data above all else. Other people will know more about parodies or contents than me. For example, consider the various futanari related contents. We have futanari, new half, transsexual and maybe other related contents. I'm not sensitive about the differences among these, and I don't care much. If I mark something as futanari, another person can change it later to new half if new half is a more accurate content. Similarly, I may not add a parody and another person can add this later. This site is a group effort and we should do what we can specialize in and not worry so much about perfection. It is not as if we have only a small amount of data to work with and the basic, rough data is finished. It is hardly that way. There is so much basic, rough work to do that we should not stop on matters of small importance.

As for me, I'm going to read over this guide soon and try to implement these rules in my romanizations, but if I make some mistakes or forget some things, I'm not going to worry much. Another person who wishes to improve my submissions can fix it later. I've already provided most of the work, so it will be a slight amount of work added to the bulk. Sometimes I also improve other people's submissions by, for example, adding some missed authors to a book. But the original person may have done the bulk of the work. I only do a small bit of revision. The first person's submission was not a worthless effort but a good first step. My work does not perfect the data. Another person may come later and improve the first person's work and my work. This process of constant revision should not bother anyone but should let people feel better that even if they make a mistake or omission, those can be corrected later by another person.
2013-06-15 01:14:21
Lepetit89
KI: 77
MoeMoe Wrote:
to Lepetit89

I didn't get enough of a reply to the NOTICE I wrote to you.
When you CHANGED the romanizations, I asked you to prove that your romanization was better, but what you did was show me your own standard (capitalizing English words).
What I wanted to hear from you was WHY it's better.

I accepted your edit when you showed that it was the romanization the author wrote on his website.


I think we've encountered a slight misunderstanding. The reason I did not continue the conversations in the NOTICE-fields was, as I hinted at above, that there is little point in going through each edit and working out something only applicable to an individual case when we could just settle on standards here and then apply them to the problem-cases.


MoeMoe Wrote:
And don't let it bother you when you get no reply.
See my posts on the forum. There are so many threads where I got no reply, and so many NOTICEs that went and still go unanswered. It's an everyday occurrence in web communities.


Frankly speaking, I understand that this is a common problem, but that's all the more reason to improve our cooperation in that regard. Everyone's hard at work for the sake of the database, but it's obvious that there are areas that need improvement. However, without conversation, we won't be able to make any improvements, as those are topics that need a consensus. Reaching a consensus, however, isn't overly feasible if there are only 1 or 2 people involved in the conversation.

I realize that I'm just as guilty of this, as I also haven't paid much attention to the forums unless something of interest to me was addressed - nonetheless, I think it would be of benefit to our success with the database if we also worked on improving the interaction between one another. Working on the database would be more fun and also easier as we could actually find solutions for issues.

Either way, I apologize for my frustration above. Some things take time, and if this is the stance towards conversation, then I should try to work on it myself instead of throwing in the towel; there's much to be gained from not giving up on this - thank you for pointing that out, Akari.
2013-06-17 20:36:50
gilic
KI: 224
To get everything back into motion I created an etherpad here:

http://board.net/p/mugimugi_standardizing_romanization

I think it's easier when there is a file which everyone can edit and work on.
2013-07-07 15:54:30
Lepetit89
KI: 77
Sorry for the delay, I was going to ask this much earlier, but I was rather occupied due to other matters.

Nonetheless, regarding 5), valid characters - I do not understand the purpose of invalid characters.

The original title field has no restrictions at all as to which characters may be used. This makes sense, since it is supposed to represent the title given by the author accurately. This includes capitalization, stylization of any kind as well as obscure characters.

However, when romanizing titles, most of these stylistic devices are converted appropriately - this is further emphasized by official romanizations taking precedence over everything else.
The exceptions to this rule are the characters that are deemed invalid.

That's where it becomes problematic.

Assuming a person capable of speaking/reading Japanese is looking for a certain Doujinshi, they would probably use the original title.
A person not capable of reading or writing Japanese would supposedly use the romanized title (which does not make much sense, but more on that later).

In the end, both the romanized title as well as the original title are used for searches. However, whether you're Japanese or European or American, you do not have access to all characters without using ASCII-codes - in that case, why would you ban certain characters from use for romanizations when all of them may be used for the original title?

Japanese keyboards and western keyboards encounter the same problems - due to that, it seems counter-intuitive to restrict use of ASCII-characters when both the original title as well as the romanized title are supposed to be accurate representations of the title.

There's more I want to address in regards to romanizations, but let's go through this issue first and possibly work out a solution/compromise.
2013-08-05 17:50:54
thecert
KI: 72

it seems counter-intuitive to restrict use of ASCII-characters when both the original title as well as the romanized title are supposed to be accurate representations of the title.

The problem here is defining "accurate representation." There's not necessarily one correct romanization of a given title. Under current guidelines ( http://doujinshi.mugimugi.org/guide/?ID=4 ), either the romanization or the translation of a transliterated word is permissible. So, for example, ブック can be romanized as "bukku" or "Book".

An especially problematic character among the characters currently labeled "forbidden" or "invalid" is the raised dot. This is often used to separate words (usually transliterated katakana words). Arbitrarily chosen example: http://doujinshi.mugimugi.org/book/10745/Suzako-A-La-Mode/

Romanized title: Suzako A La Mode
Original title: スザ子・ア・ラ・モード

In my opinion, the person doing the romanizing made a good call. The dots are just word dividers; they don't have meaning. If I'd seen this character sequence in running text, it never would have occurred to me to keep the dots when romanizing it.

The hollow star is used in this one: http://doujinshi.mugimugi.org/book/116691/

Original title: イケメン☆ラブ☆アタック!2

The romanized title field is blank, but the cover image shows the creators' romanization. It has spaces between words, not stars.

I suppose ultimately it comes down to deciding whether characters that aren't normally used in regular Latin-character text are being used in some meaningful way or simply as ornaments to add visual appeal. If the creators haven't provided a romanization, we can't know. Personally, I think the romanized title field would be OK containing only regular Latin typographical characters: letters, numbers, and the kind of punctuation marks that appear in regular text (period, comma, colon, quotation marks, apostrophe, hyphen, dash, asterisk). The raised dot character doesn't add meaning as such; it's a word divider, like a space. The solid star and hollow star can function the same way.

Or maybe the underlying question is: What is the function of the romanized title field? If this were my decision (and it isn't!), I'd say one function is making searching as simple as possible for people who are inputting with a regular roman keyboard. I want these people to be able to find スザ子・ア・ラ・モード by searching for "Suzako A La Mode," not "Suzako・A・La・Mode".

What do we gain when we keep "forbidden" characters in the romanized title field? What do we lose when we omit them?
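
To make the search argument concrete, here is a minimal Python sketch of treating decorative separators as spaces. The function name `searchable_title` and the `DECORATIVE` set are hypothetical; the set only contains the characters discussed in this thread (raised dot, solid star, hollow star) and is not taken from the site's actual code.

```python
import re

# Hypothetical set of decorative separators, limited to the characters
# discussed in this thread; the site's real list may differ.
DECORATIVE = "・★☆"


def searchable_title(romanized: str) -> str:
    """Replace decorative separators with spaces and collapse repeated whitespace."""
    table = str.maketrans({ch: " " for ch in DECORATIVE})
    return re.sub(r"\s+", " ", romanized.translate(table)).strip()


print(searchable_title("Suzako・A・La・Mode"))  # Suzako A La Mode
```

With a rule like this, someone typing on a regular roman keyboard only needs to know the words, not which ornament sits between them.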
2013-08-07 17:56:44
Lepetit89
KI: 77
thecert Wrote:
it seems counter-intuitive to restrict use of ASCII-characters when both the original title as well as the romanized title are supposed to be accurate representations of the title.

The problem here is defining "accurate representation." There's not necessarily one correct romanization of a given title. Under current guidelines ( http://doujinshi.mugimugi.org/guide/?ID=4 ), either the romanization or the translation of a transliterated word is permissible. So, for example, ブック can be romanized as "bukku" or "Book".


Excellent example, or rather, issue, you mention here. The guidelines may say that both romanizations are acceptable, but the way the guidelines are being "enforced" usually differs greatly from what the guidelines say. For instance, I would prefer "bukku", but I've never had anything confirmed if I didn't convert a Katakana-spelled foreign term into its English counterpart.

As a result, which romanization is "correct" mainly depends on who confirms a change - some will confirm what others won't and vice versa, at least in my experience.

Nonetheless, if we were to aim for a correct representation, then using the same characters used in the original title would make sense - assuming we have something like a star there.

The example you quote (the raised dot) on the other hand, is a very good example of cases that would possibly not be represented in the best fashion.

Since the raised dot is supposed to divide words in order to improve readability, it's probably not supposed to be considered a stylistic device of any kind in your example. However, we also cannot say that with 100% certainty - it's the same problem as romanizations which cannot be done unless you ask the author himself.

In the end, we're stuck in a situation in which we cannot tell for sure which is best as this is a matter of interpretation - in this case, I generally deem it best to go with what we see, so I would include the dots.

This mainly leads me to the same question as you - what purpose are romanizations supposed to serve?

Assuming we want them to be used for easy searches, we also need to consider whether they can be used for searches at all if the characters used differ from what the title actually says. If certain characters are deemed invalid, one would have to guess what the romanization actually looks like.

The important question is - what kind of person looks for entries using the romanized title?

Someone who can read the Japanese title would use the Japanese title to look for an entry.
Someone who cannot read the Japanese title actually wouldn't even be able to look for the entry. How are they supposed to even find out what to look for?

This actually brings me to an issue I've been pondering for quite some time, as I've been wondering who it is that benefits from romanizations.

As I said before, people who can read Japanese don't need them.
People who can't read Japanese cannot use them, as they cannot romanize the title of a Doujinshi they own in order to look for it.

That leaves us with people who want to organize "collections" of digital Doujinshi (cf. http://doujinshi.mugimugi.org/forum/index.php?IIID=1477 ), which, as I would assume, in more than 90% of all cases are collections of pirated Doujinshi.

So basically, we romanize titles to improve the comfort of people who illegally distribute Doujinshi. I can't exactly say I'm ecstatic about that.
Unless I'm missing something (if I am, by all means, go ahead and tell me, I'd love to hear that I did not waste hours upon hours for the benefit of pirates), making searches for Doujinshi easier is not something that needs to be considered, as barely anyone would look for them using the romanized title. Due to that, I still think that, if we romanize at all, we should stick to the original title as closely as possible.
2013-08-10 09:02:24
MoeMoe
VIP

KI: 1333

(Fact 1) The guide says that some characters should not be entered in the Romanized title.

♪ × ・ ★ ~ ∀ ♯ ☆ ○ 。


(Inference 1)

It's not reasonable if
♪ is NG but ♫ is OK
~ is NG but ∽ is OK
♯ is NG but ♭ is OK
○ is NG but ◯ is OK
。 is NG but 、 is OK

So there might be NG characters other than the list above.


(Fact 2)

Guide says "They must be replaced with SPACE or similar looking ISO character."


(Inference 2)

So the point is whether it's an ISO character or not.
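
A minimal Python sketch of such a check follows. It assumes that "ISO character" means ISO-8859-1 (Latin-1); the guide does not say which ISO character set it has in mind, so that reading and the function name `non_iso_characters` are assumptions.

```python
def non_iso_characters(title: str, encoding: str = "iso-8859-1") -> list[str]:
    """Return the characters of `title` that cannot be encoded in `encoding`.

    Assumption: "ISO character" is read as ISO-8859-1 here; pass another
    encoding name (for example "ascii") to test a stricter rule.
    """
    offenders = []
    for ch in title:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            offenders.append(ch)
    return offenders


print(non_iso_characters("Suzako A La Mode"))    # []
print(non_iso_characters("Suzako・A・La・Mode"))  # ['・', '・', '・']
```

A check like this would flag any character outside the chosen set, not only the ones in the guide's list, which is the point of Inference 1.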


(Inference 3)

The Romanized title becomes a part of the URL.
I'm not sure, but there might be a problem when a URL contains characters other than ISO characters.

cf) Percent-encoding from Wikipedia
http://en.wikipedia.org/wiki/Percent-encoding
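
As a small illustration of percent-encoding, the Python sketch below uses the standard library's `urllib.parse.quote` and `unquote`. How this site actually builds its URLs is not known; the two slugs are only modeled on the "Suzako-A-La-Mode" book link quoted earlier in the thread.

```python
from urllib.parse import quote, unquote

RAISED_DOT = "\u30fb"  # the raised dot (katakana middle dot)

ascii_slug = "Suzako-A-La-Mode"
dotted_slug = RAISED_DOT.join(["Suzako", "A", "La", "Mode"])

# ASCII letters and hyphens pass through unchanged.
print(quote(ascii_slug))   # Suzako-A-La-Mode

# Each raised dot becomes three percent-encoded UTF-8 bytes.
print(quote(dotted_slug))  # Suzako%E3%83%BBA%E3%83%BBLa%E3%83%BBMode

# Decoding an encoded kana string recovers the original kana.
print(unquote("%E3%82%B9%E3%82%A4%E3%83%BC%E3%83%88"))
```

So a non-ASCII character does not necessarily break a URL, but it makes the URL longer and much harder to read or type by hand.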


【about book → bukku method】

(admin's opinion)
http://doujinshi.mugimugi.org/forum/index.php?IIID=1052


(Standardizing)
The transliteration of English words into Kana is very inconsistent.

ex) sweet
・スイート
http://doujinshi.mugimugi.org/search/object/?Q=s&sn=%E3%82%B9%E3%82%A4%E3%83%BC%E3%83%88
・スウィート
http://doujinshi.mugimugi.org/search/object/?Q=s&sn=%E3%82%B9%E3%82%A6%E3%82%A3%E3%83%BC%E3%83%88

How should we transliterate when the title contains "sweet" written only in Latin letters, without Kana? "Suiito" or "Suwiito"?
ex)
http://doujinshi.mugimugi.org/book/628069/

・Other unstandardizable cases
V sound → ヴ or ブ
OU sound → オウ or オー
EI sound → エイ or エー
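
The "sweet" problem above can be sketched in Python. This is only an illustration: the `KANA` table is a hypothetical, deliberately tiny mapping that covers just the two spellings, and treating the long-vowel mark as a repeat of the previous vowel is an assumption chosen to match the "Suiito" / "Suwiito" spellings used in this post.

```python
# Hypothetical mini table: only the kana needed for the two spellings of "sweet".
KANA = {"ウィ": "wi", "ス": "su", "イ": "i", "ト": "to"}
LONG_VOWEL_MARK = "ー"


def romanize(kana: str) -> str:
    """Romanize a katakana word using the tiny table above (sketch only)."""
    out = ""
    i = 0
    while i < len(kana):
        if kana[i] == LONG_VOWEL_MARK:
            if not out:
                raise ValueError("long-vowel mark at the start of the word")
            out += out[-1]  # assumption: repeat the preceding vowel
            i += 1
            continue
        for size in (2, 1):  # try a two-character digraph before a single kana
            chunk = kana[i:i + size]
            if chunk in KANA:
                out += KANA[chunk]
                i += len(chunk)
                break
        else:
            raise ValueError(f"kana not in sketch table: {kana[i]!r}")
    return out.capitalize()


print(romanize("スイート"))   # Suiito
print(romanize("スウィート"))  # Suwiito
```

Both inputs are legitimate Kana spellings of the same English word, so a mechanical Kana-based rule cannot pick one romanization when the title only says "sweet".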
2013-08-10 10:05:55
Lepetit89
KI: 77
Okay, I certainly agree with the point listed under Inference 3 as well as what you say about the romanization of English words.

In that case, however, we should probably see to it that the guides are updated accordingly.

Nonetheless, I still wonder under what circumstances someone would look for a title using the romanized title. Or rather, what kind of person would do that.
2013-08-13 17:44:50