
C# to python

As a longtime C# developer, I picked up Python as a second language for ML work. Getting started in Python is easy; writing engineering-grade Python turned out to be a lot harder. These are 10 things I learned along the way to writing production code in Python.


  1. 1. FROM C# TO PYTHON 10 THINGS I LEARNED ALONG THE WAY Tess Ferrandez
  2. 2. TESS SOFTWARE ENGINEER & DATA SCIENTIST at MICROSOFT
  3. 3. NOTEBOOKS ARE FOR EXPLORATION 1
  4. 4. REUSE CODE SOURCE CONTROL DEBUG TEST CI/CD PIPELINE
  5. 5. IF IT’S GOING INTO PROD, IT’S GOING IN A .PY FILE
  6. 6. PYTHON IS VERY FLEXIBLE 2
  7. 7. # read the data df = pd.read_csv('../../data/houses.csv') # print the first five records print(df.head()) # plot the price df.price.plot(kind='hist', bins=100) plt.show() IMPERATIVE
  8. 8. def clean_region(region: str) -> str:… def clean_broker(broker_name: str) -> str:… def clean_data(input_file: str, output_file: str):… if __name__ == '__main__': clean_data('data/interim/houses.csv', 'data/processed/houses.csv') PROCEDURAL
  9. 9. FUNCTIONAL from functools import reduce def square(x: int) -> int: return x * x numbers = [1, 2, 3, 4, 5] num_sum = reduce(lambda x, y: x + y, numbers, 0) squares = map(square, numbers)
  10. 10. OBJECT ORIENTED class StringOps: def __init__(self, characters): self.characters = characters def stringify(self): self.string = ''.join(self.characters) sample_str = StringOps(['p', 'y', 't', 'h', 'o', 'n']) sample_str.stringify() print(sample_str.string)
  11. 11. YOU CAN MIX AND MATCH PARADIGMS AS YOU PLEASE, BUT KEEP YOUR CODE AND SOCKS DRY
  12. 12. USE A COOKIE CUTTER PROJECT STRUCTURE 3
  13. 13. A BIG PILE O’ FILES clean_dataset.py clean_dataset2.py clean-2019-02-01.py clean-tf-1.py super-final-version-of-this-cleaning-script.py
  14. 14. MAKEFILE SETUP DOCS NOTEBOOKS / REPORTS REQUIREMENTS.TXT TESTS SEPARATELY
  15. 15. USE A COOKIE CUTTER PROJECT STRUCTURE OTHER PEOPLE WILL THANK YOU
  16. 16. USE A COOKIE CUTTER PROJECT STRUCTURE OTHER PEOPLE I WILL THANK YOU (PERSONALLY!)
  17. 17. WRITING READABLE & MAINTAINABLE CODE 4
  18. 18. import random, sys import os def myfunc(): rando = random.random() return random.randint(0,100) def multiply (a, b): return a * b print(multiply(myfunc(), myfunc()))
  19. 19. PEP8.ORG PYTHON ENHANCEMENT PROPOSAL
  20. 20. import random, sys import os def myfunc(): rando = random.random() return random.randint(0,100) def multiply (a, b): return a * b print(multiply(myfunc(), myfunc()))
  21. 21. import random, sys import os def myfunc(): rando = random.random() return random.randint(0,100) def multiply (a, b): return a * b print(multiply(myfunc(), myfunc()))
  22. 22. import random def myfunc(): rando = random.random() return random.randint(0,100) def multiply (a, b): return a * b print(multiply(myfunc(), myfunc())) UNUSED IMPORTS
  23. 23. import random def myfunc(): rando = random.random() return random.randint(0,100) def multiply (a, b): return a * b print(multiply(myfunc(), myfunc())) SEPARATING LINES
  24. 24. import random def myfunc(): rando = random.random() return random.randint(0, 100) def multiply(a, b): return a * b print(multiply(myfunc(), myfunc())) WHITE SPACES
  25. 25. import random def myfunc(): return random.randint(0, 100) def multiply(a, b): return a * b print(multiply(myfunc(), myfunc())) UNUSED VARIABLES
  26. 26. import random def random_number(): return random.randint(0, 100) def multiply(a, b): return a * b print(multiply(random_number(), random_number())) WEIRD FUNCTION NAMES
  27. 27. import random def random_number(): return random.randint(0, 100) def multiply(a, b): return a * b print(multiply(random_number(), random_number()))
  28. 28. # See https://pre-commit.com for more information # See https://pre-commit.com/hooks.html for more hooks repos: - repo: https://github.com/ambv/black rev: stable hooks: - id: black language_version: python3.7 - repo: https://github.com/pre-commit/pre-commit-hooks rev: v2.0.0 hooks: - id: flake8
  29. 29. def add(a, b): return a + b result = add('hello', 'world') result = add(2, 3) def add(a: int, b: int) -> int: return a + b result = add('hello', 'world') result = add(2, 3)
  30. 30. PEP8 ALL THE CODES
  31. 31. A SWEET DEV ENVIRONMENT SETUP 5
  32. 32. PIP, CONDA AND VIRTUAL ENVIRONMENTS 6
  33. 33. pip install pandas conda install pandas
  34. 34. conda create --name myenv python=3.6 conda activate myenv # Install all the things # Work on the application conda deactivate
  35. 35. REQUIREMENTS.TXT
  36. 36. KEEP YOUR MACHINE CLEAN AND YOUR PANDAS SEPARATED
  37. 37. EMBRACING PYTHONIC PYTHON 7
  38. 38. SQUARE SOME NUMBERS
  39. 39. nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
  40. 40. squares = [] i = 0 while i < len(nums): squares.append(nums[i] * nums[i]) i += 1
  41. 41. squares = [] for i in range(len(nums)): squares.append(nums[i] * nums[i])
  42. 42. squares = [] for num in nums: squares.append(num * num)
  43. 43. squares = [num * num for num in nums]
  44. 44. squares = [num * num for num in nums if num % 2 == 0]
  45. 45. fruits = ['apple', 'mango', 'banana', 'cherry'] fruit_lens = {fruit: len(fruit) for fruit in fruits} {'apple': 5, 'mango': 5, 'banana': 6, 'cherry': 6}
  46. 46. SUM ALL NUMBERS BETWEEN 10 AND 1000
  47. 47. a = 10 b = 1000 total_sum = 0 while b >= a: total_sum += a a += 1
  48. 48. total_sum = sum(range(10, 1001))
  49. 49. IS THIS ITEM IN THE LIST?
  50. 50. fruits = ['apples', 'oranges', 'bananas', 'grapes'] found = False size = len(fruits) for i in range(size): if fruits[i] == 'cherries': found = True
  51. 51. found = 'cherries' in fruits
  52. 52. LIVE ZEN
  53. 53. >>> import this The Zen of Python, by Tim Peters Beautiful is better than ugly. Explicit is better than implicit. Simple is better than complex. Complex is better than complicated. Flat is better than nested. Sparse is better than dense. Readability counts. Special cases aren't special enough to break the rules. Although practicality beats purity. Errors should never pass silently. Unless explicitly silenced. In the face of ambiguity, refuse the temptation to guess. There should be one-- and preferably only one --obvious way to do it. Although that way may not be obvious at first unless you're Dutch. Now is better than never. Although never is often better than *right* now. If the implementation is hard to explain, it's a bad idea. If the implementation is easy to explain, it may be a good idea. Namespaces are one honking great idea -- let's do more of those!
  54. 54. ARGUMENT PARSING WITH CLICK 8
  55. 55. import argparse def main(): parser = argparse.ArgumentParser() parser.add_argument('input_file', default='in.txt', type=str, help='…') parser.add_argument('output_file', default='out.txt', type=str, help='…') parser.add_argument('--debug', required=True, type=bool, help='…') args = parser.parse_args() # do some work print(args.debug) if __name__ == '__main__': main()
  56. 56. python myscript.py --help Usage: myscript.py [OPTIONS] [INPUT_FILE] [OUTPUT_FILE] Options: --debug BOOLEAN [required] --help Show this message and exit.
  57. 57. import argparse def main(): parser = argparse.ArgumentParser() parser.add_argument('input_file', default='in.txt', type=str, help='…') parser.add_argument('output_file', default='out.txt', type=str, help='…') parser.add_argument('--debug', required=True, type=bool, help='…') args = parser.parse_args() # do some work print(args.debug) if __name__ == '__main__': main()
  58. 58. import click @click.command() @click.argument('input_file', default='in.txt', type=click.Path()) @click.argument('output_file', default='out.txt', type=click.Path()) @click.option('--debug', required=True, type=click.BOOL, help='…') def main(input_file, output_file, debug): print(input_file) print(output_file) print(debug) if __name__ == '__main__': main()
  59. 59. CLICK MAKES ARGUMENT PARSING READABLE AND TESTABLE
  60. 60. TESTING WITH PYTEST 9
  61. 61. def test_add_positive(): assert add(1, 2) == 3
  62. 62. @pytest.mark.parametrize('val1, val2, expected_result', [ # small values (1, 2, 3), # negative values (-2, -1, -3) ]) def test_add(val1, val2, expected_result): actual_result = add(val1, val2) assert actual_result == expected_result
  63. 63. @pytest.mark.longrunning def test_integration_between_two_systems(): # this might take a while
  64. 64. def remove_file(filename): if os.path.isfile(filename): os.remove(filename)
  65. 65. @mock.patch('src.utils.file_utils.os.path') @mock.patch('src.utils.file_utils.os') def test_remove_file_not_removed_if…(mock_os, mock_path): mock_path.isfile.return_value = False remove_file('anyfile.txt') assert mock_os.remove.called == False
  66. 66. A TEST FOLDER IN THE ROOT IS PRETTY NICE
  67. 67. THERE IS A PACKAGE FOR THAT 10
  68. 68. PLOTTING
  69. 69. NEURAL NETWORKS
  70. 70. POSE DETECTION
  71. 71. FOCUSED OPTICAL FLOW
  72. 72. import numpy as np from keras_retinanet import models from keras_retinanet.utils.image import read_image_bgr, preprocess_image, resize_image model_path = '../resnet50_coco_best_v2.1.0.h5' model = models.load_model(model_path, backbone_name='resnet50') image_path = '../data/images/basket_image.jpg' image = read_image_bgr(image_path) image = preprocess_image(image) image, scale = resize_image(image) # process image boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0)) OBJECT DETECTION
  73. 73. BACKGROUND REMOVAL
  74. 74. THIS IS THE REASON WHY WE DO ML IN PYTHON
  75. 75. 1 2 3 4 5 6 7 8 9 10
  76. 76. FROM C# TO PYTHON 10 THINGS I LEARNED ALONG THE WAY Tess Ferrandez
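
The "Embracing Pythonic Python" slides (39 to 51) lose their line breaks in this transcript, so here is a minimal runnable sketch that puts the loop versions next to their Pythonic equivalents. The names `squares_loop`, `even_squares`, and `total_loop` are introduced here for comparison only and do not appear on the slides.

```python
# Slide 39: the input list used in the "square some numbers" slides.
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Slide 40 style: index-based while loop.
squares_loop = []  # hypothetical name, kept separate for comparison
i = 0
while i < len(nums):
    squares_loop.append(nums[i] * nums[i])
    i += 1

# Slide 43 style: the same result as a list comprehension.
squares = [num * num for num in nums]

# Slide 44 style: a comprehension with a filter (even numbers only).
even_squares = [num * num for num in nums if num % 2 == 0]

# Slide 47 style: summing 10..1000 with a manual loop.
a, b = 10, 1000
total_loop = 0  # hypothetical name, kept separate for comparison
while b >= a:
    total_loop += a
    a += 1

# Slide 48 style: the built-ins do the same work in one line.
total_sum = sum(range(10, 1001))

# Slide 51 style: membership test instead of a search loop.
fruits = ['apples', 'oranges', 'bananas', 'grapes']
found = 'cherries' in fruits

# The loop and Pythonic versions agree.
assert squares_loop == squares
assert total_loop == total_sum
print(squares, even_squares, total_sum, found)
```

Both forms compute the same values; the shorter ones mainly remove the index bookkeeping where off-by-one mistakes tend to hide.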

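The mocking example on slides 64 and 65 can be tried without a project layout. The sketch below is an assumption-laden variant: it uses the standard library's `unittest.mock` as context managers and patches `os.path.isfile` and `os.remove` directly, rather than the `src.utils.file_utils` module path from the slide. The two `check_*` helper names are hypothetical; in a real test suite they would be `test_*` functions collected by pytest.

```python
import os
from unittest import mock


def remove_file(filename):
    """Delete filename only if it exists as a regular file (slide 64)."""
    if os.path.isfile(filename):
        os.remove(filename)


def check_not_removed_when_missing():
    # Hypothetical helper: pretend the file does not exist.
    with mock.patch("os.path.isfile", return_value=False), \
         mock.patch("os.remove") as mock_remove:
        remove_file("anyfile.txt")
        # os.remove must not have been called.
        return mock_remove.called


def check_removed_when_present():
    # Hypothetical helper: pretend the file exists.
    with mock.patch("os.path.isfile", return_value=True), \
         mock.patch("os.remove") as mock_remove:
        remove_file("anyfile.txt")
        # os.remove must have been called exactly once with the filename.
        mock_remove.assert_called_once_with("anyfile.txt")
        return mock_remove.called


if __name__ == "__main__":
    assert check_not_removed_when_missing() is False
    assert check_removed_when_present() is True
    print("both checks passed")
```

Because both filesystem calls are mocked, no file is touched on disk, which is what makes this kind of test safe to run in a CI pipeline.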