Haskell

Pangrams


If you haven’t figured it out by now, I enjoy solving problems. And over the course of the last year or two, I’ve learned that interview questions make for great problems to work on. Actual interview problems are nice because they are usually quick, but have a quirk or two in there that makes them challenging, unlike simple questions like the Fizz Buzz problem that just checks if you have the most basic coding skills. (Has anyone actually been asked that question in an interview?)

The most recent problem I got to sink my teeth into (found it on a recruiting site, but not going to share where I got it; wouldn’t be fair to the company posting the problem) is for finding pangrams in sentences. If you don’t know what a pangram is Wikipedia defines them as, “a sentence using every letter of the alphabet at least once.” Yeah, I didn’t know what they were either until I started programming this little puzzle. Here is the code:

  1. module Main (main) where
  2.  
  3. import System (getArgs)
  4. import qualified Data.Set as S
  5. import qualified Data.Text as T
  6. import qualified Data.Text.IO as TI (readFile)
  7.  
  8. buildList :: FilePath -> IO [T.Text]
  9. buildList filename = TI.readFile filename >>=
  10. return . map (T.toLower . T.filter (/=' ')) . T.lines
  11.  
  12. compareAndPrint :: S.Set Char -> String
  13. compareAndPrint sset = if S.null result
  14. then "NULL"
  15. else S.toList result
  16. where result = S.difference (S.fromList ['a'..'z']) sset
  17.  
  18. main = do
  19. args <- getArgs
  20. sentences <- buildList $ head args
  21. mapM_ (putStrLn . compareAndPrint) $ map( S.fromList . T.unpack) sentences

I came up with the solution pretty quickly by using Sets. Having a set of the alphabet and finding the difference of the letters used in the sentence makes the problem almost trivial. The hard part for me was figuring out how to filter out the spaces and change all characters to lower case in the buildList function. I eventually figured it out, but it took some head against wall action to get it right.

This is going to be my last post for this year. I would like to wish you all Happy Holidays and a Happy New Year. Thank for reading and see you again in 2012. I would also like to thank everyone from planet.haskell.org who decided to read this. Welcome!

Programming Praxis: The Sum of Two Integers

A couple of months ago the Programming Praxis website put up a challenge to find a sum inside an array of integers (the direct wording of the challenge can be found here) and since I’ve come up with my own solution, this little challenge has provided me with a lot of feedback.

Just to get some of the geeky stuff out of the way, here is the code I wrote for the problem:

  1. import Data.List
  2. import Data.Maybe
  3.  
  4. sumCheck :: Int -> [Int] -> [Int] -> Maybe (Int, Int)
  5. sumCheck _ [] _ = Nothing
  6. sumCheck total (x:xs) ys = if total' == Nothing
  7. then sumCheck total xs ys
  8. else return (x, (ys !! ( fromJust total')))
  9. where total' = (total - x) `elemIndex` ys

In thinking about the problem a little bit I came up with this subtraction approach. My first approach was to use addition and add every item in the array against all the other items. But this method didn’t sit well with me. After a little bike ride I came up with the code you see above.

After I wrote it, I submitted my code to the Haskell-beginners email list asking for critiques and possible enhancements. Arlen Cuss contributed a slight improvement of my code:

  1. sumCheck total (x:xs) ys =
  2. let diff = total - x
  3. in if diff `elem` ys
  4. then Just (x, diff)
  5. else sumCheck total xs ys

And Adityz Siram contributed his version. Which is basically the first algorithm that I came up with and wanted to improve upon. His code is here:
  1. sums i as bs = [(x,y) | x <- as, y <- bs, x + y == i]

Finally, Gary Klindt took all of our code snippets, used some performance analysis tools inside GHC and came up with some run times that are (hopefully) more accurate than running time on an application. Here are those stats:
print $ sumCheck 500 [1..1000] [1..1000]
sumCheck1: 58,648
sumCheck2: 58,484
sumCheck3: 70,016

print $ sumCheck 5000 [1..10000] [1..10000]
sumCheck1: 238,668
sumCheck2: 238,504
sumCheck3: 358,016
(unit: byte)

Out of the three code snippets, my function was in the middle, speed-wise. But I think that it’s also really nice to see how much better it is than the regular addition method. It’s also nice to see how the little change made to my code can improve the overall speed of the function.

At the end of the day I take a little bit of pride in myself for coming up with an improved algorithm for this task on my own. I know that on a hardware level, subtraction takes more time than addition. But I get the improvements I get because I reduce the number of additions and comparisons I have to make in order for the function to be complete. I also estimate the worst case speed for my algorithm to be O(n), which isn’t too shabby.

When I started learning Haskell, one of the things I read on the internet was how the people who programmed it were helpful to one another. I was skeptical when I first read that, but I have to say that all of my doubt has been removed. And it is interactions like this that make me glad to participate in a community as helpful as this one.

The Devil is in the Details, part 2

In the last blog post I made with this title, I told a little story on how I helped a learning Haskell programmer see why his version of the code wasn't as complete as the example given. I also thought that this series wouldn't be having a second part to it, but it looks like I was wrong.

I've recently been given a copy of “Learn You a Haskell For Great Good” to write up a review for it. I've been trying to go through that book as fast and thoroughly as I can because I really want to learn the material and because I will be able to give a better review if I actually complete the book. Yeah, big shocker there, I know.

Just a quick personal note before I continue. I'm not trying to point out these short coming in code that I see to please my ego. I'm doing this because I see a pattern that I don't think is correct and I don't only want to prevent this problem from happening within my own code, but also to bring the idea to public forefront so that other people can see what is going on to and (hopefully) learn from it.

In “Learn You a Haskell”, more specifically Chapter 5 – High Order Functions, the book walks someone through how to construct a Collatz chain. The code in the book is:

  1. chain :: Integer[Integer]
  2. chain 1 = [1]
  3. chain n
  4. | even n = n : chain ( n ` div` 2)
  5. | odd n = n : chain ( n * 3 + 1)

and as an example from the book as well, when you run the above code you get this as a result:

  1. ghci > chain 10
  2. [10,5,16,8,4,2,1]

This is the right answer, but much like the code in the last blog post, what happens if you give it a negative argument, let's say -10:

  1. ghci> chain (-10)
  2. [-10,-5,-14,-7,-20,-10,-5,-14,-7,-20,-10,...] #(I'm sure you can see the pattern at this point).

So I propose to make a small change to fix this problem. Here is my version of chain:

  1. chain :: Integer -> [Integer]
  2. chain' 1 = [1]
  3. chain' n
  4. | n <= 0 = []
  5. | even n = n : chain' (n `div` 2)
  6. | odd n = n : chain' ( n * 3 + 1)

At the end of the day, I'm just making the case that when I, or anyone really, programs something up, a quick thought should be given towards what type of input verification should be done before the function/application/whatever runs so that it runs cleanly and correctly. Ok, time to get off my soap box and start looking for these same holes on the programs I've written.

Project Euler: Problem 10

Thank you for your patience as you all (I'm sure) anxiously await my next posting.

Not only has it been a really long time since I posted something technical, it's also been a really long time since I posted a Project Euler solution. On that note, let's go over the challenge:

The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
Find the sum of all the primes below two million.

Seems simple enough doesn't it? Let's find out. For Python the code is:

  1. #!/usr/bin/python
  2. """
  3. Python based code solution to problem 10 of project euler.
  4. """
  5. import math
  6.  
  7. def is_prime(divided):
  8. divisor = 3
  9. sqrt_divided = int(math.sqrt(divided))
  10. while divisor <= sqrt_divided:
  11. if divided % divisor == 0:
  12. return False
  13.  
  14. divisor += 2
  15.  
  16. return True
  17.  
  18. if __name__ == "__main__":
  19. print sum([2] + [x for x in xrange(3,2000000+1, +2) if is_prime(x)])

And for Haskell the code looks like:

  1. module Main where
  2.  
  3. prime_wrapper:: Int -> [Int]
  4. prime_wrapper divided = prime 3 divided (round . sqrt. fromIntegral $ divided)
  5.  
  6. prime:: Int -> Int -> Int -> [Int]
  7. prime divisor divided sqrt_divided
  8. | divisor > sqrt_divided = [divided]
  9. | mod divided divisor == 0 = []
  10. | otherwise = next_prime
  11. where next_prime = prime (divisor + 2) divided sqrt_divided
  12.  
  13. main :: IO ()
  14. main = print $ sum primes
  15. where primes = 2 : [ x | x <- [3,5..2000000], prime_wrapper x /= [] ]

I will be the first to admit that this problem is a little on the easy side. Mostly because you can reuse the code you created for problems 3 and 7, or at least I did. :)

Even though you and I know that Haskell will execute this code faster than Python, I feel I still need to uphold the tradition of posting my highly inaccurate execution times:

Python: 34.054s
Haskell: 20.690s

If you made it this far down into the article, hopefully you liked it enough to share it with your friends. Thanks if you do, I appreciate it.

Bookmark and Share

Syndicate content