Tuesday, September 16, 2008

Day 2 - Session 6 - Case Study for HarrisData

So this is extremely far from a technical discussion, but important in our industry nonetheless... We often look only at architecture, proper ways of developing, and of course, patterns. We normally forget the entire user experience and the sales aspect, and this is a tribute to that in the form of a case study.

The talk is about customer retention: it's very important to show added value for upgrades, and to give customers a comfort level that upgrading takes minimal effort.

It's also important internally to figure out what caused the upgrade to take place to begin with. This also helps with sales.

Why HarrisData chose PHP:
  • Obligatory PHP answers (price, ease of use, etc).
  • Can do procedural (for legacy developers), or OOP. Java would have destroyed setups and developer ramp-ups.
  • Professional IDE improving productivity and quality
  • Market acceptance (many people already using PHP and understand it).
  • Ideal application arch (server-centric vs desktop, reliable, secure, flexible)

Day 2 - Session 5 - Secure Application Life Cycle

Secure apps are apps that do what they're supposed to do, ALL the time.

Application information must be available, have integrity, and be confidential.

Where do you implement security?
Most people consider security only on the external interfaces. This is a fallacy.
You should be implementing security/sanity checks throughout your entire application to avoid issues throughout your app.

Who is a threat?
  • Script Kiddies
  • Hackers
  • Crackers
  • Unconscious users (no, not knocked out, but rather they don't know what they're doing)
  • Your own framework (modules talking to modules).
  • Physical environment.
Approach to solve issues:
  • Take the entire SDLC, and create specialized security methods for each portion.
  • Securing your application is NEVER done. Each release is an iteration of security, and must be revisited on each release.
Requirements:
Functional and non-functional requirements both feed into the application, but security needs to be considered a functional requirement. Generally these fall under the System requirements.

Test plans:
  • Training
  • Awareness
  • Outside-the-box thinking
  • Codified security test plans
  • Use tools
  • Review application w/programmers
  • Reporting and analysis
  • End goal: clean bill of health
Things to look for:
  • Remote code execution
  • XSS
  • SQL Injection
  • PHP Configuration
  • File system attacks

Best Practices:
  • Whitelisting vs. blacklisting
  • filter input, escape output
  • Keep errors to yourself (i.e., no messages like "invalid password, 3rd letter correct".. right...)
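
To make these practices concrete, here's a minimal sketch in Python (not code from the session). The `users` table, the username pattern, and every function name are assumptions made up for illustration. It shows a whitelist check on input, a parameterized query as one guard against SQL injection, escaping at output time, and a single generic login error.

```python
import html
import re
import sqlite3

# Hypothetical whitelist: usernames may contain only these characters.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")

# One generic message, so the response never reveals which check failed.
LOGIN_FAILED = "Invalid username or password."


def filter_username(raw):
    """Filter input: accept only known-good values, reject everything else."""
    if not USERNAME_PATTERN.fullmatch(raw):
        raise ValueError("invalid username")
    return raw


def find_user(conn, username):
    """Look up a user with a parameterized query (no string concatenation)."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (username,)
    ).fetchone()


def escape_for_html(value):
    """Escape output at the point where it is written into an HTML page."""
    return html.escape(value, quote=True)
```

The filter runs when data enters the app and the escape runs when data leaves it, so a value that slips past one check still gets handled by the other.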

Day 2 - Session 4 - Scalability via Zend Platform

So far we've briefly discussed the purpose of scalability, and what it means to be scalable.

Root-cause analysis:
When an event is captured, its context is also saved. If a POST was made, and it threw the error event, then it would capture that POST data in order to provide contextual information to isolate the problem.
This also integrates with Zend Studio, so you can debug/profile immediately. You can even reproduce the HTTP request that triggers the problem.

Examples:
Showing an example of division by 0, and how the event is displayed in the Platform. You get tons of nice information such as the file hit, the file the error is in, the php error, event occurrence information (ie, how many times, first and last error trigger), method parameters, and the entire stack.
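
To get a feel for what such an event record holds, here's a rough sketch in Python. This is not Zend Platform's API: `EventMonitor`, `capture`, and `handle_request` are hypothetical names. It groups repeated errors by type and location, counts occurrences, tracks first and last trigger times, and keeps the request context and stack.

```python
import time
import traceback


class EventMonitor:
    """Hypothetical monitor: groups errors by type and location, keeps context."""

    def __init__(self):
        self.events = {}

    def capture(self, exc, context):
        frames = traceback.extract_tb(exc.__traceback__)
        frame = frames[-1]  # innermost frame, where the error was raised
        key = (type(exc).__name__, frame.filename, frame.lineno)
        now = time.time()
        event = self.events.setdefault(key, {
            "count": 0, "first_seen": now, "last_seen": now,
            "message": str(exc), "context": None, "stack": None,
        })
        event["count"] += 1
        event["last_seen"] = now
        event["context"] = dict(context)  # e.g. the POST data of the request
        event["stack"] = traceback.format_tb(exc.__traceback__)
        return event


def handle_request(monitor, post_data):
    """Toy request handler that can fail with a division by zero."""
    try:
        return int(post_data["total"]) / int(post_data["parts"])
    except ZeroDivisionError as exc:
        monitor.capture(exc, {"method": "POST", "post": post_data})
        return None
```

Because the POST data is saved with the event, the failing request can be replayed later, which is the idea behind reproducing the HTTP request that triggers the problem.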

Comprehensive Performance:
Performance is affected by many factors, such as:
  • Network load
  • PHP processing time
  • Server load
  • DB load
  • Logic in the application
A key takeaway among these performance methods is that Zend Platform can cache items all along the way.
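
As a generic illustration of the caching idea (not how Zend Platform implements it), here's a small time-to-live cache sketched in Python. `TTLCache` and `get_or_compute` are names I made up; the expensive work runs only when there's no fresh entry.

```python
import time


class TTLCache:
    """Hypothetical cache: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self.store = {}

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive work
        value = compute()
        self.store[key] = (now + self.ttl, value)
        return value
```

The same pattern can sit in front of a DB query, a rendered page, or any other slow step in the list above.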

Clustering:
  • 1 key issue is sharing session data across cluster nodes.
  • Shared storage becomes a bottleneck (NFS going down, DB going down).
Job Queues:
USUALLY:
  • PHP is isolated, with nothing shared for parallel execution.
  • Jobs must finish before another starts
  • No asynch execution
  • Background processing requires "hacking"
NOW:
  • Jobs executed in background
  • scheduled via API or a web gui
  • elaborate scheduling rules
  • job failure handling
  • monitoring and execution stats.
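
For the general shape of the "NOW" column, here's a minimal in-process sketch in Python. It isn't the Zend Platform job queue API, and `JobQueue`, `submit`, and `wait` are hypothetical names. Jobs run in a background thread, failures are caught rather than crashing the worker, and simple execution stats are kept.

```python
import queue
import threading


class JobQueue:
    """Hypothetical in-process job queue with failure handling and stats."""

    def __init__(self):
        self.jobs = queue.Queue()
        self.stats = {"done": 0, "failed": 0}
        self.lock = threading.Lock()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def submit(self, func, *args):
        """Queue a job and return immediately; the caller does not block."""
        self.jobs.put((func, args))

    def _run(self):
        while True:
            func, args = self.jobs.get()
            try:
                func(*args)
                outcome = "done"
            except Exception:
                outcome = "failed"  # job failure handling: record and move on
            with self.lock:
                self.stats[outcome] += 1
            self.jobs.task_done()

    def wait(self):
        """Block until every queued job has been processed."""
        self.jobs.join()
```

A real job queue would also persist jobs and apply scheduling rules; this sketch only covers the async execution and stats parts.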


Personal notes about the features which are badass:
Yeah that's my official title.
  • You can tell which server exactly the error comes from.
  • Clustering is scalable quickly and easily.
  • Caching can be set up for a lot of different things.
  • You can potentially have backup plans based on clusters.

Day 2 - Session 3 - RIA Applications w/ZendFramework

Zend AMF is an open-source implementation of Adobe's AMF (Action Message Format), which is a binary protocol that the Flash Player uses to serialize objects.

Can serialize any object in Flash to AMF.

-------
So far, most of this discussion is really obvious things, like Flex being a stateful application language.
-------

Day 2 - Session 2 - of Haystacks & Needles

Presented by Derick Rethans (dr@ez.no)
Slides at: http://derickrethans.nl/talks.php


Before searching, you must index, which requires:
  • Finding documents to index (crawl)
  • Separate the docs into indexable units (tokenizing)
  • Massage the found units (stemming)
Crawling:
  • Domain specific: file system, CMS, Google, etc.
  • Should have different fields of a document: title, description, meta-tags, body.
  • Crawling strategy must be determined based on domain.
  • Text:
    • global, whitespace (explode on space), continuous letters (like whitespace, but includes special chars)
    • Define stop-words that won't be included (the, of, and, or, etc)
    • define synonyms (ie, British vs American words)
    • normalize text (replace special chars w/regular chars)
    • Japanese/Chinese texts are difficult, and require special tools to interpret.
  • Stemming:
    • Porter stemming
    • Language dependent
    • Many algorithms exist.
    • ex: arrival -> arrive, skies -> sky
    • Alternatively can use soundex or metaphone
  • Types of searching:
    • words, phrases, boolean: airplane, "red wine", "wine - red"
    • faceted (categorized) search: limit results by categories defined to found results, usually an iterative process. Can be "document type", et al.
  • MySQL FullText searching: use MATCH() and AGAINST(). Also has a lot of limitations.
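
The indexing steps above (tokenize, drop stop-words, map synonyms, stem, then index) can be sketched in a few lines of Python. This is my own toy version, not code from the talk. The stop-word list and synonym map are made-up samples, and `stem` is a crude suffix stripper standing in for a real algorithm like Porter's.

```python
import re
from collections import defaultdict

STOP_WORDS = {"the", "of", "and", "or"}  # sample list only
SYNONYMS = {"colour": "color"}           # British -> American, sample only


def tokenize(text):
    """Lowercase, then split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(word):
    """Toy suffix stripper; NOT the Porter algorithm."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def analyze(text):
    """Turn raw text into index terms: tokenize, filter, stem, map synonyms."""
    terms = []
    for token in tokenize(text):
        if token in STOP_WORDS:
            continue
        term = stem(token)
        terms.append(SYNONYMS.get(term, term))
    return terms


def build_index(docs):
    """Inverted index: term -> set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Boolean AND search: documents that contain every query term."""
    terms = analyze(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

The query goes through the same `analyze` step as the documents, which is why a search for "wines" can match a document that only says "wine".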
Apache Lucene:
  • Implemented in Java
  • Powerful query types
  • Ranked searching
  • fielded searching
  • Proximity queries (search for words close to another word)
  • Zend Lucene port to PHP, but not as feature rich (although growing).
    • Keywords not tokenized or stemmed.
    • UnIndexed fields
    • Binary fields
    • Text/Tokenized Fields
    • UnStored fields: tokenized and indexed, but not stored
Apache Solr: Lucene access via a webservice.


So, this session is actually rather boring... I was expecting more theory rather than discussing the tools that are mostly used. I'm interested in Lucene, sure, but in reality I'd like to know how search engines work... Surely Google doesn't use Lucene?

Oh well, onto the next session!

- Spaz

Day 2 - Session 1

There's a beginning keynote this morning about how PHP leaders are transforming high-impact applications.... We'll soon see what exactly this means..

----

Started keynote... He's talking about how wired we have been for the past few years.

Keith Kacey (sp?) noted for organizing UnCon.

If you answer your phone, you're going to be ridiculed... That's that...

CEO Harold Goldberg walks on stage.
  • Largest ZendCon ever. Thanking everyone in the PHP community for making ZendCon possible.
  • Comparing ZendCon/PHP innovation potential to ancient technologies such as an Eel/Fish trap, honey-bee ranches, orchard sprayer, etc founded in this area in California.
  • 3 main take-aways:
    • PHP is Poised for Widespread Enterprise Adoption.
    • YOU are central to PHP's success.
    • Together, We're making History.
  • Examples:
    • Kargo Mobile Telephony replacing Java w/PHP. Did anyone doubt this happening?
    • PHP is faster, uses less resources, and allows faster roll-outs.
    • New site, all in ZendFramework, with 400% capacity, at an undisclosed amount of less hardware.
    • Tons more, all with basically the same story.
  • Top choices to use w/AJAX: PHP!
  • Estimation of 40% of PHP jobs moving to Enterprise level development.
  • More and more jobs are PHP related.
  • Declaration that Zend Certs can demand 25-35% more compensation.
  • Surpassed 8 million raw downloads of ZendFramework (from v1 to v1.6)
  • New Zend Framework Certification officially announced.
Why is PHP thriving so well?
  • ZendFramework
  • Eclipse
  • PHP Advancements
  • Support for the most widely used OSes (Windows, Linux, MacOS, IBM-i)
  • Tooling:
    • phpMyAdmin
    • Eclipse
    • PHPUnit
    • etc
  • Key Tech:
    • XML
    • MySQL
    • Oracle
    • Flex
    • Ajax
    • Dojo
    • etc
  • PHP Applications:
    • Magento (new eCommerce site built completely in ZF)
    • SugarCRM
    • WordPress
  • NEW ZEND STUDIO FOR ECLIPSE (6.1) ANNOUNCED
  • Best Practices and reuse for PHP
  • Powerful testing
More stuff about how people are using PHP, specifically with IBM i5, and a new ZendCore for IBM i5 platform.

Adobe is helping PHP implement AMF support... YAY!!!!!

Stream Energy (my company) just got mentioned... WOOT BITCHES!

OK.. end of Keynote.... Nice way to wrap it all up...

- Spaz