UTF-8 output gets mangled in the Scala Worksheet #185
Comments
Are you running on a non-UTF-8 system?
Thanks for the prompt answer! Do you agree that using the host configuration would be a bug? I investigated a bit, and before answering your question, I'll give my analysis: Eclipse is correctly configured to use UTF-8 (according to this: http://stackoverflow.com/a/9181068/53974), and that should be enough. Instead, I also need to set
Analysis: Since the documented setting is inside Eclipse itself, it seems that what I'm doing is a hack, needed because some code uses the default encoding instead of passing the Eclipse-configured one.
Side note/additional issue: line breaking seems very much not Unicode-aware, both in practice:
And maybe that happens because this implementation works in terms of bytes (it adds newlines after a certain byte count), but I didn't run anything under a debugger: Line 19 in 646a40c
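To illustrate the suspicion above, here is a minimal Java sketch of why breaking lines after a fixed byte count can corrupt UTF-8 text, next to a variant that counts code points instead. This is not the worksheet's actual code: `WrapDemo`, `wrapByBytes`, and `wrapByCodePoints` are hypothetical names made up for this example, and the chunk sizes are arbitrary.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not taken from the worksheet sources.
final class WrapDemo {

    // Splits the UTF-8 encoding of `text` every `maxBytes` bytes and decodes
    // each chunk on its own. A multi-byte character that straddles a chunk
    // boundary is cut in half and decodes to U+FFFD replacement characters.
    static List<String> wrapByBytes(String text, int maxBytes) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        List<String> lines = new ArrayList<>();
        for (int start = 0; start < bytes.length; start += maxBytes) {
            int len = Math.min(maxBytes, bytes.length - start);
            lines.add(new String(bytes, start, len, StandardCharsets.UTF_8));
        }
        return lines;
    }

    // Splits `text` every `maxCodePoints` code points, so no character is
    // ever cut in the middle (surrogate pairs stay together as well).
    static List<String> wrapByCodePoints(String text, int maxCodePoints) {
        List<String> lines = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int remaining = text.codePointCount(start, text.length());
            int end = text.offsetByCodePoints(start, Math.min(maxCodePoints, remaining));
            lines.add(text.substring(start, end));
            start = end;
        }
        return lines;
    }

    public static void main(String[] args) {
        // "ℤ → ℤ", written with escapes so the source file encoding does not matter.
        String sample = "\u2124 \u2192 \u2124";
        // Joining the chunks back together shows whether the text survived.
        System.out.println(String.join("", wrapByBytes(sample, 5)).equals(sample));
        System.out.println(String.join("", wrapByCodePoints(sample, 2)).equals(sample));
    }
}
```

With a 5-byte limit the first chunk ends in the middle of the arrow, so the byte-based variant no longer round-trips, while the code-point variant does.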
As far as I can tell, no. I'd be happy to try a test of your choice. I'm using OS X 10.9, but almost everything else on my system handles Unicode correctly. I say "almost" because IIRC some programs (TextEdit) still dare to offer me "Mac OS Roman" as the default encoding.
Regarding the Scala REPL, both inside and outside Eclipse, and
scala> sys.props("file.encoding")
res4: String = UTF-8
sys.props("file.encoding") //> res0: String = UTF-8
Also, from the prompt:
Finally, I ran this program:
package charset;

public class TestCharset {
    public static void main(String[] args) {
        System.out.println(System.getProperty("file.encoding"));
    }
}
and got this output:
So the default encoding seems to be the right one. But I must be missing something, since -Dfile.encoding=UTF8 made a difference for Eclipse.
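As a possible follow-up diagnostic, the sketch below prints both the `file.encoding` property and the charset the JVM actually uses by default, and shows that a `PrintStream` given an explicit charset does not depend on either. It only inspects the JVM it runs in, so it says nothing about the separate JVM that hosts Eclipse; `EncodingCheck` and `printWith` are hypothetical names used only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical diagnostic class; not part of Scala IDE or the worksheet.
final class EncodingCheck {

    // Prints `text` through a PrintStream that uses the named charset and
    // returns the bytes that were actually written.
    static byte[] printWith(String text, String charsetName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(buffer, true, charsetName);
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // The two values usually agree, but printing both rules out a mismatch.
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset());

        String z = "\u2124"; // the character ℤ
        // With an explicit UTF-8 stream the character is written as 3 bytes.
        System.out.println("UTF-8 bytes    = " + printWith(z, "UTF-8").length);
        // With US-ASCII it cannot be encoded and is replaced by a single '?'.
        System.out.println("US-ASCII bytes = " + printWith(z, "US-ASCII").length);
    }
}
```

If the program under test reports UTF-8 here but the output is still mangled, the lossy step is more likely on the side that reads and displays the output than in the program itself.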
This is still affected by scala-ide/scala-worksheet#185
I am also getting this issue on a UTF-8 system. All files are correctly configured to use UTF-8. The line splitting in the worksheet messes up the output.
My Scala code (a lambda-calculus implementation) produces UTF-8 output. The worksheet is exactly what I'd want, except that it doesn't cope with UTF-8 program output. As far as I can tell, the whole project is using UTF-8, and so is the workspace.
For instance, compare an output fragment, as seen by running the Scala REPL inside Eclipse:
((ℤ → ℤ) → ℤ → ℤ) → (ℤ → ℤ) → ℤ → ℤ)
with what I get in the Worksheet:
((��� ��� ���) ��� ��� ��� ���) ��� (��� ��� ���) ��� ��� ��� ���)
Each Unicode character turns into three question marks (replacement characters) because each of these characters takes 3 bytes in UTF-8: they are inside the BMP, but above U+07FF, which is the range that UTF-8 encodes with three bytes.
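The byte counts can be checked with a short Java sketch. It also shows one plausible way to end up with three garbage characters per symbol: decoding the UTF-8 bytes with a charset that treats every byte as a separate character. Whether the worksheet does exactly this is an assumption; `MojibakeDemo` and its methods are hypothetical names for this example only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not reproduce the worksheet's code path.
final class MojibakeDemo {

    // Number of bytes `s` occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encodes `s` as UTF-8, then decodes the bytes as US-ASCII.
    // Every byte >= 0x80 is invalid in ASCII and becomes U+FFFD.
    static String misdecodeAsAscii(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String z = "\u2124";     // ℤ, U+2124
        String arrow = "\u2192"; // →, U+2192

        // Both characters are in the BMP and need 3 bytes each in UTF-8.
        System.out.println(Character.isBmpCodePoint(z.codePointAt(0)));
        System.out.println(utf8Length(z));
        System.out.println(utf8Length(arrow));

        // One character in, three replacement characters out.
        System.out.println(misdecodeAsAscii(z).length());
    }
}
```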
This is with version 3.0.4 of Scala IDE. More precisely:
Scala Worksheet 0.2.3.v-2_11-201405200954-4f7988d org.scalaide.worksheet.feature.feature.group Scala IDE
Scala IDE for Eclipse 3.0.4.v-2_11-201405200946-c46f499 org.scala-ide.sdt.feature.feature.group scala-ide.org
(Plus Scala Search & ScalaTest plugins, I could provide those version numbers if needed).
I've looked at the current source code (which maybe was a bad idea), and it seems that the conversion should be done purely by Eclipse libraries here, and I can't see anything wrong with that:
scala-worksheet/org.scalaide.worksheet/src/org/scalaide/worksheet/runtime/ProgramExecutor.scala
Line 141 in 0281642