Or maybe not. I came across this programming challenge through means I'm not quite comfortable sharing on the interwebs. So disregard the origin of this challenge, and just put those spare brain cells toward the challenge and my answer.
The challenge:
Write a program that assigns products to sites in a way that maximizes the scores over
the set of customers. Each customer can only have one product and each product can
only be offered to one customer. Your program should run on the command line and
take as input two newline separated files, the first containing the names of the products
and the second containing the names of the customers. The output should be the total
SS and a matching between customers and products. You do not need to worry about
malformed input, but you should certainly handle both upper and lower case names.
The algorithm for generating the number is as follows:
• If the length of the product name is even, the base score is the number
of vowels in the customer’s name multiplied by 1.5.
• If the length of the product name is odd, the base score is the number of consonants in
the customer’s name multiplied by 1.
• If the length of the product name shares any common factors (besides 1) with the
length of the customer’s name, the score is increased by 50% above the base score.
My answer:
#!/usr/bin/python import sys from optparse import OptionParser def build_list_from_file(filename): """ builds a list from a newline delimited file. INPUT: string object, the filename OUTPUT: a list object with the lines of the file """ temp_list = [] try: with open(filename) as file: for lines in file: temp_list.append(lines.strip()) except IOError: print "the file %s does not exist. Please look into it." % filename sys.exit(1) return temp_list def count_vowels(name): """ counts the number of vowels in a string. INPUT: string object, the string to be parsed OUTPUT: int object, the number of vowels in the string """ vowels = 0 for x in name.lower(): #setting all characters to lower case for comparison for y in ['a', 'e', 'i', 'o', 'u']: if x == y: vowels += 1 return vowels def count_consinants(name): return len(name) - count_vowels(name) def generate_ss(matrix_object): """ The function to generate the SS score based on the product name and customer name. INPUT: tuple object, with the name in the first element and the product name in the second OUTPUT: a float object with the value after the computation """ cust = matrix_object[0] prod = matrix_object[1] cust_factors = get_factors(cust) prod_factors = get_factors(prod) ss = 0.0 if len(prod) % 2 == 0: ss = float(1.5 * count_vowels(cust)) else: ss = float(count_consinants(cust)) # muliplicaton by 1 is implied. # by using sets, we can now use itersection to see if there are any # common factors between the two names if len(cust_factors.intersection(prod_factors)) > 0: ss *= 1.5 return ss def get_factors(name): """ Retrive the factors of a particular number. INPUT: int object, number to gather factors of. OUTPUT: set object, the factors of a particular number """ # using a variable instead of calling len() twice, should be faster top_num = len(name) return set([x for x in range(2,top_num+1) if top_num % x == 0]) if __name__ == "__main__": """self explanitory... I hope.""" parser = OptionParser() parser.add_option("-p", "--prod", dest="prod", help = "Path to the prodcts file.Which should be newline delimited file.") parser.add_option("-c", "--cust", dest="cust", help = "Path to the customer file.Which should be newline delimited file.") (options,args) = parser.parse_args() if not (options.prod and options.cust): print "A file for customers (-c) and one for products (-p) are needed." sys.exit(1) customer_list = build_list_from_file(options.cust) prodcut_list = build_list_from_file(options.prod) matrix = [] final_list = [] matrix = [(a,b) for a in customer_list for b in prodcut_list] ss_list = map(generate_ss, matrix) # since we're zipping things up, might as well sort it # with largest SS value first zipped_by_ss = list(reversed(sorted(zip( ss_list, matrix)))) # need to prime the list in order for this to work final_list.append(zipped_by_ss[0]) del zipped_by_ss[0] while 1 < len(zipped_by_ss): j = 0 while j < len(zipped_by_ss): if ( zipped_by_ss[j][1][0] is final_list[-1][1][0] or zipped_by_ss[j][1][0] is final_list[-1][1][1] or zipped_by_ss[j][1][1] is final_list[-1][1][0] or zipped_by_ss[j][1][1] is final_list[-1][1][1] ): del zipped_by_ss[j] continue j += 1 final_list.append(zipped_by_ss[0]) del zipped_by_ss[0] for item in final_list: print "%s has item %s, with SS score of %s." % (item[1][0], item[1][1],item[0])
Thoughts about the code
Before I posted this code here, I actually had the matrix generation done with two for loops. Changing it to the list comprehension that is in the code reduced the code size by a good eight lines or so, increased the readability of that section of the program, and gave me a bit of a speed up to boot. I will remember this little trick for the future.
When I was in school I took the requisite algorithms course, but that was mostly about sorting and some other things. I don't recall them going over a “duplicate removal” algorithm. For this part of the program I believe my program runs correctly though I feel like it’s ugly code. Doing the removal of duplicates with two indexes made sense to me for removing the duplicates in line, but I feel like this could be optimized more. I'm totally open to comments on all of my code, but I would really appreciate it if people would focus more on that aspect of the program. Also, if anyone knows of any “duplicate removal” style algorithms, or places where I could learn more about them, I would be very grateful.

Bryce, I know it's wrong, but
Bryce, I know it's wrong, but at least you tried, and posted a wrong approach on the net trying just to be better at coding.
I thought Ted's response a little harsh, but it's his personality I guess. (I read his blog, and without his acid writing it wouldn't be that interesting)
If they asked you to not post this elsewhere from milo, they are right. But if they didn't explicity said something about it, IMHO you did right in not saying from where the challenge came from, and if they didn't give you feedback, well, you're also right in looking from feedback elsewhere.
Posting what we do know it's wrong, just to be criticized and improve our skills is an act of courage. Do not give up and keep trying the challenges!
Take care!
Thank you very much for your
Thank you very much for your comment __B__. It came at a time when I really needed something uplifting and encouraging.
A piece of philosophy I received at one point in my past goes, "Always assume positive intent." That is what I had when I changed up the challenge so that I could post it along with my answer. I changed the question so that it and the possible solution wouldn't be easily found in a search engine. I removed all references to Milo for the same reason. That way Milo could still use it as a recruiting challenge and I could get my answer as to why it was wrong.
For the record, so Milo can fix this if they choose, there is nothing stating that the question is to not be shared. I know that the challenge is not my IP, so I am still happy to pull the question if asked to do so.
Swing and a miss
Hi,
I'm the co-founder of Milo.com, the place where you found this engineering challenge (http://milo.com/jobs). It's kind of a bummer that you posted this without attribution, but what's more of a bummer is that you got it completely wrong, and missed the point of the exercise.
I must say it's one of most
I must say it's one of most interesting and hardest recruiting puzzles I've seen. Many companies use fizzbuzz or simple encryption algorithm and still claim most candidates fail. Out of curiosity: how is it working for you? Do you get many correct solutions?
Also: are you open to telecommute? ;-)
Hey Ted thanks for
Hey Ted thanks for writing.
You are correct, Milo is where I got challenge. I applied for a job, submitted this code, and was rejected for more interaction with your company. I asked the head hunter to ask why my submission was rejected and if it was due to being incorrect, if I could get some direction on how to improve it, but I never heard back. I am not angry with what happened, nor am I trying to be malicious. I am merely seeking the eyes of others in trying to figure out what about my code was incorrect. If you wish for me to remove this then please let me know and I will comply.
This is classic
This is classic combinatorics/graph theory problem called Assignment problem[1] with some weird input format. To get correct answer you can use hungarian algorithm[2] or simple min-cut max-flow. Your greedy approach will give wrong answer on some inputs.
[1]: http://en.wikipedia.org/wiki/Assignment_problem
[2]: http://en.wikipedia.org/wiki/Hungarian_algorithm
Interesting coincidence,
just last week I took a real-time test for a different company and the one puzzle turned out to be reducible to the assignment problem too. It wasn't obvious from the description and although I managed to figure it out, I didn't have enough time to implement the hungarian algorithm; I was supposed to answer within 80 minutes! I haven't heard back from them so I guess my optimal answer without an implementation was not worth much; perhaps a greedy or even a brute force working solution would rank higher. Oh well, their loss.
Hey Duncan, Thanks for the
Hey Duncan,
Thanks for the comment and that is an interesting coincidence. Maybe this type of question (or style of questioning) is starting to become more prevalent in interviews. Just ask 1 big complicated question and see how the candidate does and base the hiring descision off that. I don't agree with it myself, but then I'm not on the hiring side of table.
I had no idea how far off base I was with my solution until I read about assignment problem and the Hungarian algorithm. Oh well, it was a great learning exercise.
Sorry to hear about that particular job not contacting you back, better luck next time.
Hey gislan, Thanks for the
Hey gislan,
Thanks for the information and links, I will study them thoroughly.
wrong solution?
It doesn't seem you program solves the problem correctly. As far as I understand the challenge, the idea is to maximize the total sum of scores. However your program assigns products to customers by greedily choosing the biggest score among available pairs, which does not necessarily produce the biggest total score. Here is a counterexample:
Products:
even
odddd
Customers:
AAA
AAAABBB
Your program produces:
AAAABBB has item even, with SS score of 6.0.
AAA has item odddd, with SS score of 0.0.
with the total score 6.0, while the correct answer is
AAAABBB has item odddd, with SS score of 3.0.
AAA has item even, with SS score of 4.5.
with the total score 7.5.
Hey Kirill, Thank you for
Hey Kirill,
Thank you for writing and showing me where I went wrong in my solution. I appreciate it.
Post new comment