What – if anything – have we learned from C++? by Bjarne Stroustrup @ Curry On/PLE 2015
1. What – if anything – have we learned from C++?
Bjarne Stroustrup
Morgan Stanley, Columbia University
www.stroustrup.com
2. Talk aims
• A C++ talk
• No boasts
• No apologies
• No attacks on other languages
• Not restricted to language technicalities
• Lessons that might be of use in the C++ world
• Lessons that might be of wider use
• A few concrete examples (not just motherhood & apple pie)
• Not a complete memory dump
• Open a dialog
• Obviously, I don’t know “all of the answers”
• There is no best language for everything and everybody
Stroustrup - PLE/CurryOn - 2015 3
3. Overview
• C++
– Context, aims, and early design decisions
• Language myths
• Social and technical points
– Standardization, C compatibility, linking, …
• Some key C++ design points
– Generic programming
– Resource management
4. Then – early 1980s
• Ken and Dennis had only just proved that semi-portable systems
programming could be done (almost completely) without
assembler
– C didn’t have function prototypes
– Lint was state of the art static program analysis
• Most computers were <1MB and <1MHz
– PDP11s were cool
– VT100s were state of the art
– A “personal computer” about $3000 (pre-inflation $$$)
– The IBM PC was still in the future
• “Everybody” “knew” that “OO” was too slow, too special-purpose, and too difficult for ordinary mortals
– “if you want a virtual function you cannot have done your analysis right”
5. The roots of C++
[Diagram: family tree of languages. Shown: Assembler, BCPL, C, Fortran, Cobol, Simula, C++, C++11, C++14, Java, C#. Labeled areas: direct mapping to hardware; general-purpose abstraction; domain-specific abstraction.]
6. C++ in two lines
• What is C++?
– Direct map to hardware
• of instructions and fundamental data types
• Initially from C
– Zero-overhead abstraction
• Classes with constructors and destructors, inheritance,
generic programming, function objects
• Initially from Simula
• Much of the inspiration came from operating systems
• What does C++ want to be when it grows up?
– See above
– And be better at it for more modern hardware and techniques
7. Map to Hardware
• Primitive operations => instructions
– +, %, ->, [], (), …
• int, double, complex<double>, Date, …
• vector, string, thread, Matrix, …
• Objects can be composed by simple concatenation:
– Arrays
– Classes/structs
• All maps to “raw memory”
[Diagram: memory layouts built from “value” and “handle” boxes]
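A minimal sketch of the "simple concatenation" point. The types Point and Segment and the two helper functions are hypothetical names chosen for illustration. The sketch checks that array elements sit directly next to each other and that a vector acts as a handle to contiguous elements.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical value type: two ints stored one after the other.
struct Point {
    int x;
    int y;
};

// A class composed of other objects simply contains them.
struct Segment {
    Point from;
    Point to;
};

static_assert(std::is_standard_layout<Segment>::value,
              "Segment is laid out like a plain C struct");
static_assert(sizeof(Point[4]) == 4 * sizeof(Point),
              "an array is its elements, nothing more");

// Distance in bytes between two consecutive array elements.
std::size_t stride_of_point_array()
{
    Point a[4] = {};
    return static_cast<std::size_t>(
        reinterpret_cast<const char*>(&a[1]) -
        reinterpret_cast<const char*>(&a[0]));
}

// A vector is a handle: a small object referring to elements
// that are stored contiguously elsewhere.
bool vector_elements_are_contiguous()
{
    std::vector<Point> v(4);
    return &v[3] == v.data() + 3;
}
```

Nothing here depends on a run-time system: the layout is fixed at compile time, which is what makes the mapping to raw memory direct.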
8. Early Design Decisions (1979/80)
• C compatibility
– Almost: function declarations
– Leave no room for a language below (except assembler)
• Simula classes and class hierarchies
– But all objects treated uniformly
• e.g., class objects on the stack
and integers in dynamic storage
• Generic types and operations
– Using macros
• Yuck!
• Constructors and destructors
– Establishing invariants
– Resource management
• Concurrency support through libraries
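One way to read "constructors establish invariants" is sketched below. Range and rejects() are hypothetical names, not from the talk. The constructor either establishes the invariant lo <= hi or throws, so every member function can rely on it.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical example type: a closed integer range with the invariant lo <= hi.
class Range {
public:
    Range(int lo, int hi) : lo_{lo}, hi_{hi}
    {
        if (hi < lo) throw std::invalid_argument{"Range: hi < lo"};
    }
    // Safe to compute without re-checking: the invariant holds for every Range.
    int size() const { return hi_ - lo_ + 1; }
private:
    int lo_;
    int hi_;
};

// Returns true if constructing Range{lo, hi} is rejected.
bool rejects(int lo, int hi)
{
    try {
        Range r{lo, hi};
        (void)r;  // silence unused-variable warnings
        return false;
    }
    catch (const std::invalid_argument&) {
        return true;
    }
}
```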
9. Later (1980s)
• Overloading (1983)
– First: constructors, =, and functions
– [], (), -> (. still missing)
• Exceptions
– Throw an object, catch by type
• Use class hierarchies for grouping
• Templates
– Types and functions
– Type and value parameters
– Type deduction for template arguments
– Implicit instantiation
• Multiple inheritance
– Especially useful for abstract classes (interfaces)
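A small sketch of "throw an object, catch by type" with a class hierarchy used for grouping. The error types and classify() are hypothetical names for this illustration. A handler for the base class catches every derived error that no earlier handler took.

```cpp
#include <cassert>
#include <string>

// Hypothetical error hierarchy used only for illustration.
struct Io_error { virtual ~Io_error() = default; };
struct File_error : Io_error {};
struct Network_error : Io_error {};

// Throws an object selected by `kind` and reports which handler caught it.
std::string classify(int kind)
{
    try {
        if (kind == 0) throw File_error{};
        if (kind == 1) throw Network_error{};
        if (kind == 2) throw 42;  // objects of built-in types can be thrown too
        return "nothing thrown";
    }
    catch (const File_error&) { return "file"; }   // most specific first
    catch (const Io_error&)   { return "io"; }     // the whole group
    catch (...)               { return "other"; }  // anything else
}
```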
10. Control the Message
Traditional (from 1985 onwards) and far too complex:
• “C++ is a general-purpose programming language with a bias
towards systems programming that
– is a better C
– supports data abstraction
– supports object-oriented programming
– supports generic programming”
11. Control the Message
Conventional, simple, popular, and wrong:
• “C++ is an Object-Oriented Language”
– Implied/implies (to many)
• C++ is poorly designed
– C compatibility is a mistake
• Avoid most effective C++ techniques
– Classes not in hierarchies
– Non-virtual functions
– Free-standing functions
– Generic programming
12. Language Myths*
• We want a simple language!
– No, teachers want a simple language to teach
• In a semester or a quarter
– No, researchers want a simple language to manipulate
• And extend
– No, programmers want a simple language to start using
– Developers want a comprehensive language to use
• Languages grow over time (massively)
– “Everybody” wants “a simpler language with just two more features”
* Not all myths are believed by all people
Not all myths are untrue everywhere
13. Language Myths
• We want an intuitive notation!
– People confuse the familiar for the simple
• For new features, people insist on LOUD explicit syntax
• For established features, people want terse notation
• Examples:
– template<typename Container> void sort(Container&); // early
• void sort(Sortable& C); // later
– try { f(x); } catch (Foo&) { … throw; } // early
// plus exception specifications
• f(x); // later
– X* p = new X(2); // early (e.g., Simula)
• X a(2); // later
14. Language Myths
• We want an efficient language!
– No, most of the time programmers want a convenient language
• However slow
• Most programmers can’t even measure performance
– No, most researchers prefer an inefficient language
• For which it is easy to devise optimizations and improvements
– For many applications, we need an efficient language
• But not for most parts of most applications
• Distributed fat can be expensive
15. Language Myths
• We want a language for writing reliable code
– No, many programmers are very intolerant about
• Inconveniences imposed for reliability, performance, or security
• The need to learn new concepts
– No, most programmers do not care
• They will ship when their management says “ship!”
• Lack of professionalism
16. Language Myths
• There is a best language
– For everybody and for every task
– One size fits all
• Oh, no!
– These myths confound
• Education
• Practice
• Research
• Language design
• Management
• Funding
17. Have a message
• What is the language for?
• Who is the language for?
• What would make the language better?
– Define “better”
– Be specific
• No language can be everything to everybody
• C++
– Provides a direct map to hardware
– Provides very general zero-overhead abstraction mechanisms
– Primarily industrial
• Rewards good programmers
18. Standardization
• A necessary evil?
– “You can’t have a major programming language controlled by a single
company”
• Actually, you can: Java, C#, …
• There are many kinds of standardization
– ISO, ECMA, IEEE, W3C, …
• Long-term stability is a feature
– You need a standards committee
• Vendor neutral
– Important for some major users
– Deprives C++ of development funds
19. How is ISO C++ Standardization Done?
• 100 members at meetings, 200-300 more online
– Primarily industry
• Democratic process
– one company one vote
– No technical qualifications for membership ($1280/year)
– the aim is consensus
• Committees
– add complexity (every feature can become a cancerous growth)
– add delays
– Have no coherent technical view
– Ensure stability (a major feature)
– Makes it hard to set and maintain direction
– See “WG21”
21. Compatibility
• Compatibility
– Is a feature
• Stability over decades
– Is valuable
– Is expensive
– Is hard
– Hard to correct mistakes
• A (valid) alternative
– Pascal, Turbo Pascal, Object Pascal, Pascal 2
– Ada
– Modula, Modula2, Modula3
– Oberon, Oberon-2, Oberon-7
22. Language Definition
• How do you specify a language?
– Not English!
• Though compiler writers love it
– Don’t define semantics of character sequences
• At least go to ASTs
– Not bottom-up lambda calculus
– Not virtual machine
• Formally describe key interfaces
– Memory model
– Concurrency model
– Vector, List
• Formally verify the implementation of those interfaces
23. Inter-language Interoperability
• From day #1
– C linkage (and Fortran)
– Interoperability was an explicit aim
• No decent linkage
– of C++-specific constructs
• classes and templates
• No C++-specific dynamic linkage
• It takes two to tango
– Interoperability is not just a C++ problem
• A massive problem
– Dragging C++ down to the C level (interfaces)
• No standard-library types, no exceptions, …
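The "dragging C++ down to the C level" point can be sketched as follows. The function count_spaces is a hypothetical interface invented for this example. Only built-in types and pointers cross the boundary, and exceptions are stopped before they reach the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical C-level interface: C linkage, no classes, no templates.
extern "C" std::size_t count_spaces(const char* text, std::size_t length);

// The implementation may use C++ internally, but the interface cannot
// expose std::string, and exceptions must not escape into C callers.
extern "C" std::size_t count_spaces(const char* text, std::size_t length)
{
    try {
        std::string s(text, length);  // may throw std::bad_alloc
        std::size_t n = 0;
        for (char c : s)
            if (c == ' ') ++n;
        return n;
    }
    catch (...) {
        return 0;  // errors are flattened to a plain value at the boundary
    }
}
```

The caller loses the type information and the error reporting that a C++ interface could have offered, which is the cost the slide points at.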
24. Teaching – massive harm being done
• “C first”
– Low-level hacking with lots of pointers, casts, and macros
• “Pure OOP”
– Deep class hierarchies with lots of virtual functions, pointers, casts,
and macros
• And more
“use of a macro is a sign of a weakness in the design or in the language” – BS
25. Community
• The C++ user community has no real center
– Not even the ISO C++ committee
• Many benefit without contributing
• An ISO standards committee has limited scope
• 2013: The C++ foundation
– A major failing
• Somehow, I should have done better
– No funding
• Marketing, library distribution, information exchange
• Lots of “little empires”
– platform, compiler, library, tool supplier, consultants
– Some “little empires” are not so little: N*100K users
• Large corporations prefer languages they can own and control
– No differential advantage in a shared, standardized language
• The number of C++ programmers is still growing
26. C++
• Language features exist to serve programming styles
• A case study:
– Resource management
27. Resources
• A resource is something that must be acquired and later released
– Explicitly or implicitly
– A resource is represented by an object
• C
– Local variables of built-in types
– malloc()/free()
– Pointers to all objects
– Manual initialization and clean-up
• Simula
– Local variables of built-in types
– new (with initialization) for class objects
– References to dynamically-allocated objects of class type only
– Manual clean-up; not integrated with memory management
– Garbage collection
28. Resources
• C++
– Local variables of all types
– new (with initialization) for all objects
– delete with implicit clean-up
– malloc()/free() for compatibility
– Pointers to all objects
• And people used a mixture of C and Simula styles
– Complexity
– Leaks
• We can and must do better
– All types are treated uniformly
• Pointers/references
• Allocation/deallocation
• initialization/clean-up
29. Resources
• Resource management should not be manual
– We don’t want leaks
– We don’t want complex resource management code
• Pointer manipulation
• Catch clauses
• Dispose idiom
• Resources are not just memory
– Non-memory resources
• Thread handles, locks, …
• A resource should have an owner
– Usually a “handle”
– A “handle” should present a well-defined and useful abstraction
Garbage collection is not sufficient
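A sketch of a handle that owns a non-memory resource. The names open_handles, acquire(), release(), Handle, and use_and_leave() are hypothetical; the counter stands in for an operating-system resource such as a lock or a socket. The resource is released on every exit path without catch clauses that do clean-up and without a dispose call.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-in for an operating-system resource (hypothetical names).
int open_handles = 0;
int acquire() { return ++open_handles; }  // returns a nonzero token
void release(int) { --open_handles; }

// The handle owns exactly one resource and releases it in its destructor.
class Handle {
public:
    Handle() : id_{acquire()} {}
    ~Handle() { if (id_ != 0) release(id_); }
    Handle(const Handle&) = delete;             // exactly one owner
    Handle& operator=(const Handle&) = delete;
    Handle(Handle&& other) noexcept : id_{other.id_} { other.id_ = 0; }
private:
    int id_;
};

// Acquires a resource, then leaves either normally or by exception.
// Returns the number of handles still open afterwards.
int use_and_leave(bool fail)
{
    try {
        Handle h;
        if (fail) throw std::runtime_error{"Weird!"};
    }
    catch (const std::runtime_error&) {
        // no clean-up code needed here: ~Handle already ran
    }
    return open_handles;
}
```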
30. Resources
• All C++ standard-library containers manage their elements
– vector
– list, forward_list (singly-linked list), …
– map, unordered_map (hash table),…
– set, multiset, …
– string
• Other C++ standard-library classes manage other resources
– thread, lock_guard, …
– istream, fstream, …
– unique_ptr, shared_ptr
• Non-memory resources
– “other resources”
– Some container elements
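To make "containers manage their elements" observable, the sketch below uses a hypothetical element type, Tracked, that counts its live instances. The helper live_counts() is also an assumption of this example. When the vector goes out of scope, every element is destroyed, including the strings the elements own.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts how many instances are alive.
struct Tracked {
    static int live;
    std::string name;
    explicit Tracked(std::string n) : name{std::move(n)} { ++live; }
    Tracked(const Tracked& t) : name{t.name} { ++live; }
    Tracked(Tracked&& t) noexcept : name{std::move(t.name)} { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Returns {live elements inside the scope, live elements after the scope}.
std::pair<int, int> live_counts()
{
    int inside = 0;
    {
        std::vector<Tracked> v;
        v.emplace_back("Strachey");
        v.emplace_back("Richards");
        v.emplace_back("Ritchie");
        inside = Tracked::live;
    }   // v is destroyed here and destroys its elements
    return {inside, Tracked::live};
}
```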
31. Control
• We control object lifetime/life-cycle declaratively
– Creation of objects: constructors
– Destruction of objects: destructors
– Copying of objects
• Construction and assignment
• from one scope to another
– Movement of objects
• Construction and assignment
• from one scope to another
– Access to representation
• At no cost compared to low-level hand coding
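A sketch of how copy and move are controlled per type. Probe and copy_then_move() are hypothetical names, and only the constructors are shown; the assignment operators would follow the same pattern. The counters record which operation the compiler selected.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical type that records which operations were used.
struct Probe {
    static int copies;
    static int moves;
    std::vector<int> data;
    explicit Probe(std::size_t n) : data(n) {}
    Probe(const Probe& p) : data(p.data) { ++copies; }                // duplicate the elements
    Probe(Probe&& p) noexcept : data(std::move(p.data)) { ++moves; }  // transfer the elements
};
int Probe::copies = 0;
int Probe::moves = 0;

// Returns {number of copies, number of moves} for one copy and one move.
std::pair<int, int> copy_then_move()
{
    Probe::copies = 0;
    Probe::moves = 0;
    Probe a(1000);
    Probe b = a;             // copy: a is still needed afterwards
    Probe c = std::move(a);  // move: a gives up its elements
    assert(b.data.size() == 1000);
    assert(c.data.size() == 1000);
    assert(a.data.empty());  // a moved-from std::vector is empty
    return {Probe::copies, Probe::moves};
}
```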
32. Resource Management
• Use constructors and a destructor
template<typename T>
class Vector { // vector of elements of type T
public:
    Vector(initializer_list<T>); // acquire memory; initialize elements
    ~Vector(); // destroy elements; release memory
    // …
private:
    T* elem; // pointer to elements
    int sz; // number of elements
};

void fct()
{
    Vector<double> vd {1, 1.618, 3.14, 2.99e8};
    Vector<string> vs {"Strachey", "Richards", "Ritchie"};
    // …
}
33. Resources and Pointers
• Many (most?) uses of pointers in local scope are not exception safe
void f(int n, int x)
{
    Gadget* p = new Gadget{n}; // look, I’m a Java programmer!
    // …
    if (x<100) throw std::runtime_error{"Weird!"}; // leak
    if (x<200) return; // leak
    // …
    delete p; // and I want my garbage collector!
}
– No: it leaks
• No “Naked New” no “naked pointers”!
Stroustrup - C++ Style - India'15 34
34. Resources and Pointers
• A std::shared_ptr releases its object when the last shared_ptr to it is destroyed
void f(int n, int x)
{
    auto p = make_shared<Gadget>(n); // make a Gadget{n}
                                     // return a shared_ptr<Gadget>
                                     // manage that pointer!
    // …
    if (x<100) throw std::runtime_error{"Weird!"}; // no leak
    if (x<200) return; // no leak
    // …
}
– shared_ptr provides a form of garbage collection
• And general resources are correctly handled
– I don’t want to create any garbage!
35. Resources and Pointers
• A std::unique_ptr releases its object when it goes out of scope
void f(int n, int x)
{
    auto p = make_unique<Gadget>(n); // make a Gadget{n}
                                     // return a unique_ptr<Gadget>
    // …
    if (x<100) throw std::runtime_error{"Weird!"}; // no leak
    if (x<200) return; // no leak
    // …
}
• I don’t create any garbage!
• I don’t impose any overhead compared with explicit new/delete
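The "no leak" comments can be checked with a runnable sketch. Gadget here is a hypothetical stand-in that counts live instances, and live_after_f() is a helper invented for this example. It assumes a C++14 compiler for std::make_unique.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical Gadget that counts live instances so leaks become observable.
struct Gadget {
    static int live;
    explicit Gadget(int) { ++live; }
    ~Gadget() { --live; }
};
int Gadget::live = 0;

void f(int n, int x)
{
    auto p = std::make_unique<Gadget>(n);
    if (x < 100) throw std::runtime_error{"Weird!"};  // p's destructor deletes the Gadget
    if (x < 200) return;                              // same here
    // normal use of *p would go here
}

// Calls f and reports how many Gadgets are still alive afterwards.
int live_after_f(int x)
{
    try { f(1, x); }
    catch (const std::runtime_error&) {}
    return Gadget::live;
}
```

All three exit paths (exception, early return, normal return) leave zero live Gadgets.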
36. Resources and Pointers
• But why use a pointer at all?
• If you can, just use a scoped variable
void f(int n, int x)
{
    Gadget g {n};
    // …
    if (x<100) throw std::runtime_error{"Weird!"}; // no leak
    if (x<200) return; // no leak
    // …
}
37. How to get data cheaply out of a function?
• Common problem
– factory functions
– functions returning lots of objects (in containers)
• Often solved by returning a pointer to an object on the free store
– Implies memory management
• Garbage collection
• Counted pointers
• Manual memory management
• That is a source of
– Complexity
– Resource related errors
Stroustrup - Essence - India'15 38
38. Move Semantics
• Return a Matrix
Matrix operator+(const Matrix& a, const Matrix& b)
{
    Matrix r;
    // copy a[i]+b[i] into r[i] for each i
    return r;
}
Matrix res = a+b;
• Define a move constructor for Matrix
– don’t copy; “steal the representation”
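A minimal sketch of "steal the representation". This Matrix is a hypothetical one-dimensional stand-in (a handle to n doubles), not the Matrix of the talk, and move_steals_representation() is a helper invented for the example. The move constructor copies only the pointer and the size and leaves the source empty.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical minimal Matrix: a handle to n doubles on the free store.
class Matrix {
public:
    explicit Matrix(std::size_t n) : elem_{new double[n]()}, n_{n} {}
    ~Matrix() { delete[] elem_; }

    Matrix(const Matrix& m) : elem_{new double[m.n_]}, n_{m.n_}  // deep copy
    {
        for (std::size_t i = 0; i != n_; ++i) elem_[i] = m.elem_[i];
    }
    Matrix(Matrix&& m) noexcept : elem_{m.elem_}, n_{m.n_}       // steal
    {
        m.elem_ = nullptr;  // the source no longer owns anything
        m.n_ = 0;
    }

    double& operator[](std::size_t i) { return elem_[i]; }
    const double& operator[](std::size_t i) const { return elem_[i]; }
    std::size_t size() const { return n_; }
    const double* data() const { return elem_; }
private:
    double* elem_;
    std::size_t n_;
};

Matrix operator+(const Matrix& a, const Matrix& b)
{
    Matrix r(a.size());
    for (std::size_t i = 0; i != a.size(); ++i) r[i] = a[i] + b[i];
    return r;  // moved or elided, never deep-copied
}

// True if moving transfers the element pointer instead of copying elements.
bool move_steals_representation()
{
    Matrix a(3);
    const double* before = a.data();
    Matrix b = std::move(a);
    return b.data() == before && a.size() == 0;
}
```

Assignment operators are omitted to keep the sketch short; because a move constructor is declared, the implicit copy assignment is deleted, so the class cannot be assigned by accident.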
39. So why do we use pointers?
• And references, iterators, etc.
• To represent ownership
– Don’t! Stop! Instead, use handles
• To reference resources
– from within a handle
• To represent positions
– Be careful
• To pass large amounts of data (into a function)
– E.g. pass by const reference
• To return large amounts of data (out of a function)
– Don’t! Instead use move operations
Stroustrup - Roots'15 40
40. C++11/C++14
template<typename C, typename V>
vector<Value_type<C>*> find_all(C& c, V v) // find all occurrences of v in c
{
    vector<Value_type<C>*> res;
    for (auto& x : c)
        if (x==v)
            res.push_back(&x);
    return res;
}

vector<int> v { 0,1,7,3,5,7,13,21,7}; // simple test code
vector<int*> v7 = find_all(v,7);

string m {"Mary had a little lamb"}; // simple test code
for (const auto p : find_all(m,'a')) // p is a char*
    if (*p!='a')
        cerr << "string bug!\n";
C++ in the real world - Budapest - 2015 41
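The slide uses Value_type<C> without showing its definition. The sketch below assumes it is an alias for the container's value_type, which is one plausible reading, and adds a hypothetical helper count_sevens() so the test code can run as a unit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumed definition: the element type of container C.
template<typename C>
using Value_type = typename C::value_type;

// Find all occurrences of v in c; the result is returned by value (moved out).
template<typename C, typename V>
std::vector<Value_type<C>*> find_all(C& c, V v)
{
    std::vector<Value_type<C>*> res;
    for (auto& x : c)
        if (x == v)
            res.push_back(&x);
    return res;
}

// Number of 7s in the slide's test vector.
std::size_t count_sevens()
{
    std::vector<int> v {0, 1, 7, 3, 5, 7, 13, 21, 7};
    return find_all(v, 7).size();
}
```

The returned pointers refer into the container, so they are valid only while the container is alive and unmodified.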
41. Syntactic convergence?
Python:
def mean(seq):
    n = 0.0
    for x in seq:
        n += x
    return n / len(seq)
C++14:
auto mean(const Sequence& seq) {
    auto n = 0.0;
    for (auto x : seq)
        n += x;
    return n / seq.size();
}
42. We can simplify
• Python:
def mean(seq):
    return sum(seq) / len(seq)
• C++14:
auto mean(const Sequence& seq)
{
    return accumulate(seq) / seq.size();
}
43. But that’s not C++!!!
• Yes, concepts are only a TS
– only implemented on a branch of GCC
• That version of accumulate is mine
– Written in ISO standard C++ plus the TS
• In C++11, I define
template<typename Cont, typename Acc = typename Cont::value_type>
Acc accumulate(const Cont& c, Acc init = Acc{})
{
    return std::accumulate(begin(c),end(c),init);
}
• Most languages simply do not have an ISO Standard
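Without the concepts TS, mean can be approximated in plain C++14 with an unconstrained template. This is a sketch under assumptions: the namespace sketch is a hypothetical name, and mean passes 0.0 as the initial value so that integer sequences are averaged in floating point, as in the loop version with n = 0.0.

```cpp
#include <cassert>
#include <iterator>
#include <numeric>
#include <vector>

namespace sketch {  // hypothetical namespace for this example

// Container-level accumulate; Acc defaults to the element type.
template<typename Cont, typename Acc = typename Cont::value_type>
Acc accumulate(const Cont& c, Acc init = Acc{})
{
    return std::accumulate(std::begin(c), std::end(c), init);
}

// Unconstrained stand-in for `auto mean(const Sequence& seq)`.
template<typename Sequence>
auto mean(const Sequence& seq)
{
    return sketch::accumulate(seq, 0.0) / seq.size();
}

}  // namespace sketch
```

The difference from the concepts version is in the interface: nothing here states that Sequence must be iterable or have size(), so misuse is reported from inside the template body.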
44. Questions?
C++: A light-weight abstraction programming language
Key strengths:
• software infrastructure
• resource-constrained applications
Practice type-rich programming
46. C++ Information
• www.isocpp.org
– The C++ Foundation’s website
– Standards information, articles, user-group information
• Bjarne Stroustrup
– A Tour of C++: All of C++ in 180 pages
– The C++ Programming Language (4th edition): All of C++ in 1,300 pages
– Programming: Principles and Practice using C++ (2nd edition)
– www.stroustrup.com: Publication list, C++ libraries, FAQs, etc.
• The ISO Standards Committee site
– Search for “WG21”
– The ISO standard: All of C++ in 1,300 pages of “standardese”
– All committee documents (incl. proposals)
48. Bjarne’s top ten for C++17
• * concepts (they allow us to precisely specify our generic programs and address the
most vocal complaints about the quality of error messages)
• * modules (provided they can demonstrate significant isolation from macros and a
significant improvement in compile times)
• * Ranges and other key STL components using concepts (to improve error messages for
mainstream users and improve the precision of the library specification “STL2”)
• Uniform call syntax (to simplify the specification and use of template libraries)
• * Co-routines (should be very fast and simple)
• * Networking support (based on the asio in the TS)
• * Contracts (not necessarily used in the C++17 library specification)
• * SIMD vector and parallel algorithms
• * Library “vocabulary types”, such as optional, variant, string_view, and array_view
• * A “magic type” providing arrays on the stack (stack_array) with support for
reasonably safe and convenient use.
* means: we made significant progress in Lenexa or Frankfurt
5/11/2015 Lenexa trip report 49
49. Resources/Ownership
• Garbage collection is neither necessary nor sufficient
– This needs proof
• Not necessary
– I/we need to build many kinds of systems to prove that
• Not sufficient
– Non-memory resources
• Thread-handles, file-handles, locks, sockets, containers holding non-memory
resources
– Resource retention time
– Distributed systems
– NUMA memory