Threading/Blocking and avoiding the Beachball
I have been tinkering with the little Cocoa I have learned so far. I have put together a simple project that goes through a list of URLs one by one, puts each URL into the "currently processing" text field, gets the HTML from it, and makes sure there is no "error" text (post not found, article doesn't exist, etc.). That's it for now and it works... kind of. For a short list it seems to work; for a long list it does what it should, but the dreaded beachball appears. Also, the text field is not updated for each URL; it is only updated after the last URL is finished. I have done a bit of research and it seems my threading or blocking is incorrect?
I am not looking for a specific answer to my problem here, and there is way too much code to post. I am not looking for you guys to code this for me. All in all it is a throwaway project, or at best will need to be recoded. What I am looking for is what I need to research more to avoid this problem. Maybe some common errors that cause this?
Here's basically what's going on:
1. button pressed sends 'startDownloading' to appController.
2. startDownloading has a loop inside that does the rest.
I have created a class to hold and parse the HTML and return bits of it.
Todd
Todd,
I'll happily look at this for you next week (if I don't go to England). Can you zip your code and send it to me? I won't steal your code - honest - I'm only trying to help.
I'm rather busy during the next few days with open-source stuff which I have promised. Once I'm clear of that, I'll be happy to look at your stuff.
Todd,
Can I validate the specification here with you, please? I've written a little script:
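#!/bin/bash
# Read page names from the file given as $1 and count the lines of each page.
count=0
while read -r line
do
    count=$((count+1))
    url="http://clanmills.com/${line}"
    # The echo arguments are left unquoted so any padding in the wc output collapses.
    echo url = $url lines = $(curl --silent "$url" | wc -l)
done < "$1"
echo "processed : $count files"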
which reads the file:
BoydWedding.html
NicholasBirthday.html
arizona.html
dennis.html
eventsevents.html
gps.html
gpsfoo.html
popup.html
rollovermap.html
rollovermap1.html
statesonly.html
utah.html
and produces the output:
543 /home/rmills/temp$ readem.sh readem.txt
url = http://clanmills.com/BoydWedding.html lines = 10
url = http://clanmills.com/NicholasBirthday.html lines = 10
url = http://clanmills.com/arizona.html lines = 14
url = http://clanmills.com/dennis.html lines = 28
url = http://clanmills.com/eventsevents.html lines = 509
url = http://clanmills.com/gps.html lines = 83
url = http://clanmills.com/gpsfoo.html lines = 280
url = http://clanmills.com/popup.html lines = 32
url = http://clanmills.com/rollovermap.html lines = 104
url = http://clanmills.com/rollovermap1.html lines = 104
url = http://clanmills.com/statesonly.html lines = 10
url = http://clanmills.com/utah.html lines = 10
processed : 12 files
So, it's reading the names of files from readem.txt and building URLs which it then gets using curl. For simplicity, all I'm doing is a simple line count on each file as proof that all the files are different.
I've written a command-line tool version of this in Obj/C++ (code below) and here's the output:
2010-03-14 20:57:28.599 readem[25593:a0f] http://clanmills.com/BoydWedding.html lines = 11
2010-03-14 20:57:28.697 readem[25593:a0f] http://clanmills.com/NicholasBirthday.html lines = 11
2010-03-14 20:57:28.794 readem[25593:a0f] http://clanmills.com/arizona.html lines = 15
2010-03-14 20:57:28.890 readem[25593:a0f] http://clanmills.com/dennis.html lines = 29
2010-03-14 20:57:29.091 readem[25593:a0f] http://clanmills.com/eventsevents.html lines = 510
2010-03-14 20:57:29.191 readem[25593:a0f] http://clanmills.com/gps.html lines = 84
2010-03-14 20:57:29.300 readem[25593:a0f] http://clanmills.com/gpsfoo.html lines = 281
2010-03-14 20:57:29.397 readem[25593:a0f] http://clanmills.com/popup.html lines = 33
2010-03-14 20:57:29.498 readem[25593:a0f] http://clanmills.com/rollovermap.html lines = 105
2010-03-14 20:57:29.601 readem[25593:a0f] http://clanmills.com/rollovermap1.html lines = 105
2010-03-14 20:57:29.697 readem[25593:a0f] http://clanmills.com/statesonly.html lines = 11
2010-03-14 20:57:29.793 readem[25593:a0f] http://clanmills.com/utah.html lines = 11
2010-03-14 20:57:29.984 readem[25593:a0f] http://clanmills.com/ lines = 212
2010-03-14 20:57:29.984 readem[25593:a0f] processed = 13 files
Let's ignore that the arithmetic doesn't quite match: the Obj/C++ version counts an extra line per page and reports 13 files instead of 12. Let's not worry about those details, which have to do with a trailing blank line both in readem.txt and in the code pulled from the internet.
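For anyone curious where the extra line and the 13th "file" come from, here is a rough sketch in bash. The three-name sample list and the variable names are made up for illustration; the only assumption about readem.txt is that it ends with a newline. `wc -l` counts newline characters and the `read` loop stops at the end of the file, whereas splitting on "\n" (which is what componentsSeparatedByString: does) returns one more piece than there are separators, the last piece being empty.

```bash
#!/bin/bash
# Hypothetical three-name list that ends with a newline, standing in for readem.txt.
list=$(mktemp)
printf 'a.html\nb.html\nc.html\n' > "$list"

# wc -l counts newline characters (tr strips the padding some wc versions add).
newlines=$(wc -l < "$list" | tr -d ' ')

# Split the whole file on "\n": RS is set to a byte that never occurs,
# so awk sees one record and split() returns separators + 1 pieces.
pieces=$(awk 'BEGIN { RS = "\001" } { print split($0, parts, "\n") }' "$list")

# The read loop only sees the real names.
loops=0
while read -r line
do
    loops=$((loops+1))
done < "$list"

echo "wc -l = $newlines, split pieces = $pieces, read loop = $loops"
rm -f "$list"
```

With the sample list this should report 3 for wc -l, 4 for the split and 3 for the read loop. That is the same pattern as 10 versus 11 lines and 12 versus 13 files in the two outputs above, and the empty last piece is what turns into the bare http://clanmills.com/ request.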
Here's the code:
#import <Foundation/Foundation.h>
// Count the pieces of s when it is split on newlines.
static NSUInteger lineCount(NSString* s)
{
    NSArray* listItems = [s componentsSeparatedByString:@"\n"];
    return [listItems count] ;
}
int main (int argc, const char * argv[])
{
    if ( argc != 2 ) {
        fprintf(stderr, "syntax: readem file\n") ;
        return 1 ;
    }
    NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
    NSStringEncoding encoding ;
    NSError* error = nil ;
    NSUInteger urlCount = 0 ;
    // get the lines from the input file
    NSString* fileName = [NSString stringWithUTF8String:argv[1]];
    NSString* fileString = [NSString
        stringWithContentsOfFile : fileName
        usedEncoding : &encoding
        error : &error
    ];
    NSArray* lines = [fileString componentsSeparatedByString:@"\n"] ;
    // run over the lines, convert them to URLs and read them from clanmills.com
    // (each read is synchronous: the loop waits for one page before asking for the next)
    NSUInteger i ;
    for ( i = 0 ; i < [lines count] ; i++ ) {
        urlCount ++ ;
        NSString* urlString = [NSString stringWithFormat:@"http://clanmills.com/%@",[lines objectAtIndex:i]];
        NSURL* url = [NSURL URLWithString:urlString];
        NSString* s = [NSString
            stringWithContentsOfURL : url
            usedEncoding : &encoding
            error : &error
        ];
        // NSUInteger is 64-bit on current Macs, so cast and print with %lu
        NSLog(@"%@ lines = %lu",urlString,(unsigned long)lineCount(s)) ;
    }
    NSLog(@"processed = %lu files",(unsigned long)urlCount) ;
    [pool drain];
    return 0;
}
Robin,
Thank you for the reply. I have been sick and didn't want to leave you hanging. I have a ton of stuff to catch up on, so I won't be able to read and comprehend your post until later today, although it does seem like you have the idea. A long list will likely be in the low hundreds. I also got a copy of Hillegass's book and realized I could not subject you to my code... yet. I think I am doing things the hard way and not fully embracing Cocoa.
Thanks again,
Todd
You've got the book. That's the best step you can make in the right direction.
No matter the state of your code, I'd deal with it and never pass judgement. Please let me know if you need help; you can take as long as you like.
I was rather surprised by the speed of the code I wrote. I think it must simply be pulling the files from the cache - I can't believe it does about 1000 gets/second. However, I'm busy with some open-source stuff, and I might remember to revisit this. More likely, I'll forget.
However if you ask for help, I'll most certainly help you.