Empirisoft Support

Thread: problem in data recording from custom/html items

  1. #1
    Join Date
    May 2008
    Posts
    2

    problem in data recording from custom/html items

    hello,

    if this has been discussed elsewhere in the forum, i did not find it after an extensive search. i have now found a solution to this conundrum; maybe it can save some trouble for someone else who wrestles with the same thing.

    when using an html file in medialab and collecting several variables from a form in that file (e.g., using dummy variables), medialab kept recording a null character at the end of the value of the last variable in the form (regardless of which variable is last in the form definition, and regardless of where it is written in the data file, which depends on its alphabetical name). this happened for text/csv files as well as spss files. the result was a royally screwed up spss file with funny characters in it, and text files from which i always had to clean out these null characters before being able to read them into, say, R for analyses. it seems that this null character originates from the post data that medialab intercepts from the html file and then processes further. these null characters show up in a decent editor (notepad++) as marked in this image:

    null character.jpg

    i have now found out that the problem was with the internet explorer (who would have thunk?) running on my own and our lab machines, which displays the html pages within medialab. these ie installations used a special character encoding as their default, and that led to the null characters: i finally got rid of them in the recorded data when i explicitly specified another character encoding, either for the form alone or in the head of the html file, like so:

    form:
    Code:
    <form name="form1" method="post" accept-charset="ISO-8859-1">
    or
    head:
    Code:
    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
    so if you're getting weird data recording, check whether your ie uses f%§$"&d up non-standard character encodings and make it explicit that you do not want that, either in the form or for the entire html file.
    possibly try out different encodings until the data recording goes as smoothly as mine does now.
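
    To check whether a given encoding setting actually removed the null characters, it helps to scan the recorded data file for them. The following is only a minimal sketch in Python, which is an assumption here since the thread itself uses R and a text editor; the names find_nul_bytes and report are hypothetical helpers, not part of MediaLab.

```python
from pathlib import Path


def find_nul_bytes(data: bytes) -> list[tuple[int, int]]:
    """Return (line_number, nul_count) for every line that contains NUL bytes."""
    hits = []
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        count = line.count(b"\x00")
        if count:
            hits.append((line_number, count))
    return hits


def report(path: str) -> None:
    """Print which lines of the text/csv file at `path` contain NUL bytes."""
    hits = find_nul_bytes(Path(path).read_bytes())
    if not hits:
        print(f"{path}: no null characters found")
    for line_number, count in hits:
        print(f"{path}: line {line_number} has {count} null character(s)")
```

    Calling report on a freshly recorded text/csv file after each encoding change shows whether the change had any effect. The path you pass in is whatever your own data file is called.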

    may some deity strike down upon windows' own, non-standard character encodings!

    best,
    johann

  2. #2
    Join Date
    Feb 2006
    Posts
    13

    Thanks!

    Thanks for figuring this out.
    Danke Johann!

  3. #3
    Join Date
    Sep 2010
    Posts
    1

    Same problem

    I seem to have the same problem. R stops reading my csv data at the strange character. The proposed solution however did not help in my case. I tried character set definitions iso-8859-1, windows-1252, utf-8 and us-ascii without success. I am using IE 7.

    Any other ideas?


    Edit: As a workaround, I now use GNU sed to remove the characters. In case someone has the same problem and uses R, here's how I do it (be sure to adapt the paths):

    Code:
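    # Strip NUL bytes with GNU sed (\c@ matches NUL) before R parses the CSV; adapt both paths.
    sed_cmd <- "C:/Programme/GnuWin32/bin/sed.exe -e \"s/\\c@//g\" \"C:/path/to/questionnaire.csv\""
    mydataframe <- read.csv(pipe(sed_cmd), header=TRUE, sep=",", quote="", stringsAsFactors=FALSE, comment.char="", fileEncoding="latin1")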
    I still think it would be preferable if MediaLab took care of writing clean data files itself.
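
    For readers without GNU sed, the same idea (drop the null characters in memory, then parse) can be sketched in Python. This is an illustration under assumptions, not a MediaLab feature: read_csv_without_nuls is a hypothetical helper, latin1 is taken from the R call above, and the csv module's default quoting is used rather than the quote="" setting of that call.

```python
import csv
import io


def read_csv_without_nuls(raw: bytes, encoding: str = "latin1") -> list[dict[str, str]]:
    """Parse CSV bytes into a list of row dicts after dropping NUL bytes."""
    cleaned = raw.replace(b"\x00", b"")  # remove every NUL byte before decoding
    text = cleaned.decode(encoding)
    # newline="" leaves line endings untouched so the csv module handles them itself
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)
```

    The bytes would come from reading the data file in binary mode, so the file on disk is never modified.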
    Last edited by Sven; 09-01-2010 at 09:02 AM. Reason: Workaround added

  4. #4
    Join Date
    May 2008
    Posts
    2
    Quote (originally posted by Sven):
    I seem to have the same problem. R stops reading my csv data at the strange character. The proposed solution however did not help in my case. I tried character set definitions iso-8859-1, windows-1252, utf-8 and us-ascii without success. I am using IE 7.

    Any other ideas?
    in my original troubles with this, i finally resorted to deleting these null characters using a decent editor; i think it was notepad++ [<http://notepad-plus-plus.org/download>]. if i remember correctly, it lets you search for the null characters [using <\0>] and replace them [by nothing].
    notepad++ is powerful free software (free as in beer and speech) for windows. you would have to do the search and replace by hand in every data file you have; batch procedures may make that easier if you have many data files.
    in general (and especially under linux) you should be able to do this search & replace, possibly with some regular expression trickery, in any solid text editor. it is quite the pain in the butt to do it the first time, but i found that, even without remembering exactly how i did it in a particular instance, it comes back very quickly the next time around.
    make sure you try it out on a copy of a data file first, and only go on to the original files once that works alright, to prevent data loss.
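
    One possible batch procedure, sketched in Python under the assumption that the data files are plain text/csv files sitting in one folder: it writes cleaned copies into a separate folder and never touches the originals. The function name strip_nuls_to_copies and the folder names in the usage note are hypothetical. It should not be pointed at binary spss files, since those can legitimately contain zero bytes.

```python
from pathlib import Path


def strip_nuls_to_copies(source_dir: str, target_dir: str, pattern: str = "*.csv") -> dict[str, int]:
    """Write NUL-free copies of matching files into target_dir; originals stay untouched.

    Returns a mapping of file name to the number of NUL bytes removed.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    if source.resolve() == target.resolve():
        raise ValueError("target_dir must differ from source_dir to protect the originals")
    target.mkdir(parents=True, exist_ok=True)

    removed = {}
    for path in sorted(source.glob(pattern)):
        if not path.is_file():
            continue
        data = path.read_bytes()
        cleaned = data.replace(b"\x00", b"")
        (target / path.name).write_bytes(cleaned)
        removed[path.name] = len(data) - len(cleaned)
    return removed
```

    A call such as strip_nuls_to_copies("raw_data", "clean_data") would return how many null characters were dropped per file, which makes it easy to compare against the originals before relying on the cleaned copies.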

    hope this helps.

    johann
