Thursday, August 13, 2015

Fisher-Yates Shuffle, Reservoir Sampling, and Random permutations of N

I recently had a conversation with someone who introduced me to the concept of Reservoir Sampling, which is a technique of generation for a combination of k randomly selected elements when there are N elements total, where N is unknown or larger then available memory, and you must select elements without replacement.


Here's a version I just wrote:

import random

"""
iterator is an Iterator/Generator object for a stream

and N >= k
"""
def reservoir(iterator, return_count):

    arr = []
    current = 0
    for item in iterator:
        if current < return_count:
            arr.append(item)
        else:
            rand = random.randint(0, current)
            if rand <= return_count:
                arr[rand] = item
        current += 1
    return arr


Once I read it, I noticed that it is strikingly similar to an inside out Fisher-Yates Shuffle (which is how you you return a random permutation of a list from a list)

That would look like:


import random

def shuffle(cards):
    deck = []
    current = 0
    for card in cards:
        rand = random.randint(0, current)
        if rand != current:
            deck.append(deck[rand])
        deck[rand] = card
    return deck

That said, with reservoir, you won't have a random permutation of the elements.  And shuffle uses extra memory if the length of the card pile is very long, and you only need some k cards.

Thus:
        
 import random

 def resshuffle(iterator, return_count):
    arr = []
    current = 0
    while (current < return_count):
        rand = random.randint(0, current)
        if current < return_count:
            arr.append(arr[rand])
            arr[rand] = iterator.next() 
         current += 1   
   for item in iterator:
            if rand <= return_count:
                arr[rand] = item
        current += 1

    return arr

Sunday, February 8, 2015

Nurse Scheduling Problem and Optimization

I had an interview on Friday, and one of the questions has been on my mind, because I didn't really like how vague my answer was. It was presented as a design problem, but really it's a software engineering problem.  And it's a problem that I might be working on, if hired.

It turns out that it's something of an instance of the Nurse Scheduling Problem.  Nurses need to work a certain number of shifts, shouldn't have back to back shifts, should work about equally each week, then you have to consider different skill levels, shift preferences etc.

This isn't the exact question, of course... I'm not getting hired at a hospital.  But considering I've been scheduling a lot of interviews, it strikes me that having software engineers do technical interviews of candidates is something of the same problem.  Less busy, but still, a structure lots of people in tech can understand.

It's easy to think of additional constraints, either hard constraints which if violated make a schedule invalid, and soft constraints which can be used to rate valid solutions.  Maybe you have an open office, and two people shouldn't interview simultaneously if they work next to each other.  Soft constraint. Some people can interview for front end, some for back end, some for mobile, some for embedded, some for multiple.  Maybe you have all of those at your company, and need to assign individuals to particular candidates.  You may not give a good interview if you assign the embedded guy to interview someone to see if they actually know android development.  Hard Constraint.

Let's say we have d developers who can interview, i individuals interviewing, and t  time slots that a candidate says they are available and lets say we want to organize based on week.

 Even with just d and i  and the hard constraints, the naive solution will be in polynomial time. Now finding the optimal solution, with the soft constraints, the time slot issue... this is the nurse scheduling problem, and it's NP hard.

Imagine you are a manager assigning people, that's a lot of things to keep track of.  Add in sick days, projects that are priorities, and a manual system is going to be pretty stressful and complicated.

So obviously I'm not looking for an optimal solution.   But I've been thinking on heuristics that might be helpful.

Time slots where only one person is scheduled are easier to sub for.
Everyone may be equal, but multiple certifications can sub for more people, and can interview more people.

It strikes me that there are a variety of ways to simplify the problem.   So I'll try to solve the general problem programmatically and handle corner cases in an adhoc manner.
list available time slots in a calendar.  Have the the Candidates pick one time slot to commit to, then reschedule if needed. If a time slot overlaps, have a rough indication of interviewers available.  Green for multiple interviewers, red for 1, disabled for 0.

Limit time slots. A morning slot and an afternoon slot.  If none of the time slots work, the candidate can inform the company of when they are available by emailing.

Keep track of how many interviews a person does in a week, and when randomly selecting individuals, prefer selecting people who have done less.

I don't need to have an optimal solution, just a reasonably good one that makes finding workable solutions easier.   And it's okay that I didn't have perfect answers.  I don't need to design a perfect system, just one better then what they have currently.  Optimization isn't always about finding the optimum way to do things.  As long as you can make something better, you have improved it.









Friday, December 5, 2014

A work flow for setting up front end development professionally on ubuntu

Introduction

I noticed that there wasn't a really good step by step guide for setting yourself up professionally to develop a web app from step 1 on Ubuntu. I'm a vim user, and setting vim up for javascript is a whole other post or two, but this should be editor agnostic. I don't know if this is perfect, for example I will probably add a test runner and test framework.
Suggestions for things to add, especially test setup, are welcome.
  • Karma and Jasmine for unit testing?
  • Protractor and Selenium Web Driver for End to end testing?
Still I don't know that this is all collected anywhere.

Summary of steps

  1. Install Requirements to build nvm and git
  2. Install nvm
  3. Make your .bash_profile source nvm and install the stable version of nodejs
  4. Make your .bash_profile run nvm use stable
  5. Use npm to install yo, grunt-cli, and bower
  6. Run yo and install angular generator
  7. Run yo angular to generate angular
  8. Use grunt serve to display an example front page
  9. Use yo to add angular directives, controllers etc.
  10. Develop your app

1 Install Requirements to build nvm and git

We are installing nvm so we don't have a local version of node and don't have to constantly invoke sudo to install stuff with npm.
First we install requirements to build nvm as well as download it:
sudo apt-get install build-essential libssl-dev curl git-core

2 Install nvm

You can use the very insecure and easy to hijack:
curl https://raw.githubusercontent.com/creationix/nvm/v0.20.0/install.sh | bash
Hopefully though, you are a bit smarter, and also want control over what you do.
Choose the Manual Install:
git clone https://github.com/creationix/nvm.git ~/.nvm && cd ~/.nvm && git checkout `git describe --abbrev=0 --tags`

3 Make your .bash_profile source nvm and install the stable version of nodejs

add the following to your .bash_profile
export NVM_DIR="/home/lawrence/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && . "$NVM_DIR/nvm.sh"  # This loads nvm
NVM_SYMLINK_CURRENT="true"
open a new shell, it should source nvm as above.
run the following to install the latest stable version of node.
nvm install stable

4 Make your .bash_profile run nvm use stable

Add the following to the bottom of your .bash_profile to have node always running
nvm use stable > /dev/null
run nvm use stable in your shell and there you go.

5 Use npm to install yo, grunt-cli, and bower

simple:
npm install --global yo bower grunt-cli

6 Run yo and install angular generator

type yo at the command line and search for angular, then install it.
yo
Why are we installing yeoman? It basically sets up grunt tasks for us, so we can have grunt do lots of things for us without having to understand grunt.
What are grunt tasks? Minification of javascript files for example.

7 Run yo angular to generate angular

yo angular
I'm using angular as an example, but there are plenty of generators for yeoman.

8 Use grunt serve to display an example front page

grunt serve

9 Use yo to add angular directives, controllers etc.

run yo as needed.

10 Develop your app

Simple enough

Saturday, October 5, 2013

Monday, August 19, 2013

Samsung Hackathon

This past Weekend I attended a Hackathon, my first, presented by Samsung, over at the PayPal Campus in San Jose.  It had a variety of sponsors, including Immersion Haptics and PayPal.

The website was on http://hackathons.splashthat.com/


I thought I would comment on what I thought done right, and what could have been improved in my opinion.

First of all, it was a large and successful event, where 21 teams presented.

They fed us 4 meals over two days, and had snacks, candy, and drinks to boot.  The food was plentiful, and there were vegetarian options, which was nice. There was plenty of food, and they didn't clear left overs away immediately, so I had a slice of pizza from lunch at 3 am, and mexican food from dinner after breakfast.

There was also keg of beer those over 21 could get a drink from with dinner.  I don't know what type of beer, but it was tasty.

They opened up a bunch of rooms on two floors for our use in the building.
In fact, the staff, both the PayPal employees and the event staff were wonderful and welcoming.  There were volunteers there, including one guy, I believe from Holland, who stayed the whole night and kept going.  People from the company who were helping with the API stayed until 8, and were there in the morning.

There were people there willing to help with the APIs, (but see below).

They gave us phones to work on, in exchange for Driver licenses, and later, credit cards (no charge), so that we could keep the devices over night.  This helped a lot, especially with the Chord API, since it is Samsung specific.

There were however a few things that I thought could have gone better.  They didn't prevent me from enjoying the Hackathon, but I think the sponsors could have gotten more engagement from us with a few small changes.

API issues, Rooms and Prizes

First of all, a large focus of the Hackathon was on the Samsung Chord API.  The Chord API didn't work over the PayPal wifi network.  That should have been tested to begin with, fixed if possible, or more widely announced otherwise.  We were able to get it working over wifi direct, but we could have been told to do that.

The Chord API was hard to use in projects in Eclipse, and the Technical Evangelist suggested to multiple people that they base their project on the sample code for the chord demo, instead of showing how to include it.  We ended up having to do that to get it working, and having to create a new repository to do so.

They also said that there were issues with the Emulator... this was true, since even on a separate wifi AP, we could only get the emulator to receive, not send.

The Chord API seemingly worked differently on different devices.  Our app would crash the Note Phablet we were given, but not other phones.  We also had issues with the same xml rendering certain items on one phone, and not on another, crashing depending on the order of devices, and updates triggered by Chord messages only running on one device.

Given that all the Phones were Samsung devices, this suggests app development across devices may require a lot of testing for idiosyncratic differences.  Given the short timeline of the Hackathon, and our lack of experience with Android we were unable to fix several device specific issues before final submission, which effected the appearance of our final work.

The Immersion Haptic API was fairly easy to use in Eclipse.  That said, there were no instructions for including it in the Android Studio beta, so we ended up switching to Eclipse from that.  I did talk to someone passing out Immersion swag later in the day (see below) and mentioned this, and they said they would find out if there were Android Studio instructions in the pipeline, but I never saw them again, even when I went looking.

Not a huge thing, and understandable, but something they may want to look at in the future, when Android Studio and it's build system become the default for new developers and official documentation.

Also, I believe the Immersion Haptic API License requires you to mention them.  It would have been nice if sample code/xml for this had been provided for the many teams who were new to Android.

They opened up rooms, which was good because there was a lot of crowding to start with, and the initial room had speakers playing music which a few of us found distracting.  We were able to spread out, move furniture around, and nobody got upset with us.  It's hard to find a room that will hold 700 + people, but I think they could have used video conferencing equipment in the rooms so that people didn't have to crowd into the initial room for announcements.  Alternatively, they could have removed the tables and chairs to fit more people, and split us up for development.

PayPal wanted people to try their APIs, and a few teams did, (including the overall winner, Split, who had an awesome check splitting app).
However, the prizes were all for the Chord API or Immersion Haptics API.  I think PayPal would have gotten more engagement if the prizes had mentioned using their API, or they had sponsored a prize themselves.  Unfair, given all they did for us, but developers are going to focus on what the Hackathon instructions say, and the PayPal API wasn't mentioned until the opening announcements.

I didn't attempt to use it, so I am not sure that there were people from PayPal to help, but I think so.  I did glance at the documentation, and it was very nice.

Swag

Swag, IMHO should promote the company to interested parties.    

Some swag like T-Shirts for the Hackathon and from Immersion was given out to people who came early.  But many people left after the first few hours meaning that the swag went to people who didn't necessarily engage with the Hackathon beyond showing up.

PayPal dumped a box of old swag on a table near the beginning of the day (10 am maybe?), and it went quickly.  I did get a sweet multitool.  But that could have been saved for people who engaged with the Hackathon, or even used as a Prize.

Immersion did give out cards with usb drives and copies of their software later in the day after many people had left.  That was the right thing to do, waiting.

I would note that at least one team demoed an app for using the Chord API to notify friends about swag.


Final Thoughts


I personally had a great time. I learned a lot about Android development, quick prototyping, git, etc.

You can see our project on https://github.com/bdettmer/RockPaperMonsterHack

I am interesting in using all three API's in the future. I appreciated the effort it takes to run a Hackathon, and the cost.  I only mention the above in the hope that
future Hackathons can be more productive for the companies.

#samsung #samsunghackathon #immersion #paypal



Thursday, July 25, 2013

When I fell in love with Computer Programming


  • When I was 8 or 9, I had an MS DOS computer.  The monitor was color, and it has QBASIC on it, and I had a book of QBASIC programs.  I remember writing programs that would generate circles, with a random color, size, and location on the screen. I would also generate sounds, beeps really, in time with the circles.  I didn't know about command line arguments, so every change I made I would do by changing the code itself.  The numbers were all numeric literals.  I got it running so fast that the filling of the circles couldn't keep up with their drawing. I liked making it go, and controlling it, and making music and shapes.
  • When I was 13 or, we got a A dialup 28.8k modem, and a Windows  Computer from Gateway.   There was no built in programming language, but I liked chatting online, and eventually started using Mirc and writing a few scripts. There may be an archive of one or two somewhere, because I posted a few to some websites, back in the day.  I learned about variables and conditionals.  I liked getting the program to do what I wanted, and not having to type things repeatedly.  
  • I'm and about to graduate with a Bachelors Degree in Psychology.  I had a great statistics professor, Jack Vevea, who has us do our classes using R.  I did well, and I and another undergrad were invited to one of his graduate level courses in item response theory.    We have a program where we had to calculate a value on 2400 elements.  Each element took about 6 minutes with the program we were given, because it tries to calculate a value to a high degree of accuracy. That's 10 days of non-stop running. I look at the code and figure out how to narrow the search down at each iteration so it runs more quickly.  
  • I'm back in school, and  I am learning C and Java.  I take the data structures and Java for C programmers classes at the same time I take the prerequisite course, work really hard and get an A in all of them.  I like learning, and working hard, and figuring things out.  
  • I find out about the tutor program.  I join it, and and am made a mentor and senior tutor within the first week, on the basis of my work.  I find out that I love helping people learn about programming, and that having to explain stuff makes me comprehend it better.
  • I am looking through job postings for internships using my browser's find in page function, and realize I can write something to automate this for me.  I write the first version of Gutsy, in Perl.  I like being able to take what I know and use it in real everyday life.  
I fell in love multiple times with programming.  It wasn't a linear progression, where I learned to code a little when I was young.  I had to come back to it, again and again, learning new things each time.  


Tuesday, July 9, 2013

quick vim trick for fixing indentation and returning to where you were

So, many people have noted that you can use gg=G to fix indentation in a file.

gg makes the program go to the top of the file, and =G idents until the bottom.   however this places your cursor at the first line in the file, which isn't something you probably wanted to do when you just wanted to fix indentation.

But you can use 2<C-o> to go back to the place you typed the command. <C-o> jumps back, but you need two jumps backward to get back to where you were.

Therefore I've placed nmap <leader>g gg=G2<C-o> in my .vimrc to fix indentation as needed without needed to go back to where you were.