Foros del Web - [CONTRIBUTION] file_get_contents(), cURL, HTTP_Request

abimaelrc · #7 · 04/08/2009, 12:16

To check the status of a site with file_get_contents, there is a variable that holds the response header information: $http_response_header.

PHP code:

<?php
file_get_contents("http://forosdelweb.com/");
var_dump($http_response_header);

The drawback of this approach is that file_get_contents has to load the whole page, and only once it has loaded does the variable give you the information. You can limit the download with the fourth and fifth parameters (the offset and the maximum length), so that PHP reads none of the page content or, for example, only a single character.

PHP code:

<?php
// no characters are read (second argument is use_include_path, a boolean)
file_get_contents("http://forosdelweb.com/", false, null, 0, 0);
var_dump($http_response_header);

// only the first character is read
file_get_contents("http://forosdelweb.com/", false, null, 0, 1);
var_dump($http_response_header);
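Another option along the same lines is to ask the server for the headers only, by passing a stream context that sets the request method to HEAD. This is a minimal sketch, not part of the original post: head_context() is a hypothetical helper name, and the timeout and ignore_errors values are assumptions you may want to change.

```php
<?php
// Hypothetical helper: builds a stream context that sends HEAD instead of GET.
function head_context($timeoutSeconds = 5)
{
    return stream_context_create(array(
        'http' => array(
            'method'        => 'HEAD',          // request headers only
            'timeout'       => $timeoutSeconds, // assumed value, in seconds
            'ignore_errors' => true,            // do not fail on 4xx/5xx answers
        ),
    ));
}

// Usage (needs network access, so it is left commented out):
// file_get_contents('http://forosdelweb.com/', false, head_context());
// var_dump($http_response_header);
```

With this context the body is never requested, so there is nothing to trim with the offset and length parameters.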

But there is a simpler way to see the headers: get_headers.

PHP code:

<?php
var_dump(get_headers('http://forosdelweb.com/', true));

// This is how to check whether a page is working or not:
// element 0 holds the status line of the response
$getHeader = get_headers('http://forosdelweb.com/', true);
echo $getHeader[0];
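The status line is just a string such as "HTTP/1.1 200 OK", so to act on it you still have to pull the number out. Here is a small sketch of how that could be done. status_code() and is_up() are hypothetical helper names, and treating 2xx and 3xx as "working" is my assumption, not a rule.

```php
<?php
// Hypothetical helper: extracts the numeric code from a status line,
// or returns null when the string is not a status line.
function status_code(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: 2xx and 3xx count as "working" (an assumption).
function is_up(string $statusLine): bool
{
    $code = status_code($statusLine);
    return $code !== null && $code >= 200 && $code < 400;
}

// Usage (needs network access, so it is left commented out):
// $headers = get_headers('http://forosdelweb.com/');
// echo ($headers !== false && is_up($headers[0])) ? 'up' : 'down';
```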

cURL can also check the status of websites. In this example we check the status of links, which is a good way to find out whether the links we have posted on a page are broken or have been moved. Just type the address in the browser:
http://localhost/nombre_de_este_arch...s_de_links.com The target does not have to be another site; it can even be your own files.

Once the page has been loaded, the program uses XPath to get the list of links on the page. Then, after some preprocessing (resolving relative links and skipping duplicates), each link is requested. Because only the headers of those responses are needed, a GET request is unnecessary; setting CURLOPT_NOBODY makes cURL send a HEAD request instead. Enabling CURLOPT_HEADER tells curl_exec() to include the response headers in the string it returns. Based on the response, the status of the link is printed, together with the new location if it has been moved.

PHP code:

<?php
// Stop early if no URL was passed in the query string
if (!isset($_GET['url'])) die('No url parameter given');
$url = $_GET['url'];
 
// Load the page
list($page,$pageInfo) = load_with_curl($url);
 
if(!strlen($page)) die("No page retrieved from $url");
 
// Convert to XML for easy parsing
$opts = array('output-xhtml' => true, 'numeric-entities' => true);
$tidy = new tidy;
$xml = $tidy->repairString($page,$opts);
 
$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('xhtml','http://www.w3.org/1999/xhtml');
 
// Compute the Base URL for relative links
$baseURL = '';
// Check if there is a <base href=""/> in the page
$nodeList = $xpath->query('//xhtml:base/@href');
if ($nodeList->length == 1) {
    $baseURL = $nodeList->item(0)->nodeValue;
}
// No <base href=""/>, so build the Base URL from $url
else {
    $URLParts = parse_url($pageInfo['url']);
    if (! (isset($URLParts['path']) && strlen($URLParts['path']))) {
        $basePath = '';
    } else {
        $basePath = preg_replace('#/[^/]*$#','',$URLParts['path']);
    }
    // parse_url() returns credentials under the 'user' and 'pass' keys
    if (isset($URLParts['user']) || isset($URLParts['pass'])) {
        $auth = isset($URLParts['user']) ? $URLParts['user'] : '';
        $auth .= ':';
        $auth .= isset($URLParts['pass']) ? $URLParts['pass'] : '';
        $auth .= '@';
    } else {
        $auth = '';
    }
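    // Keep an explicit port if the URL has one
    $port = isset($URLParts['port']) ? ':' . $URLParts['port'] : '';
    $baseURL = $URLParts['scheme'] . '://' .
               $auth . $URLParts['host'] . $port .
               $basePath;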
}
 
// Keep track of the links we visit so we don't visit each more than once
$seenLinks = array();
 
// Grab all links
$links = $xpath->query('//xhtml:a/@href');
 
foreach ($links as $node) {
    $link = $node->nodeValue;
    // resolve relative links
    if (! preg_match('#^(http|https|mailto):#', $link)) {
        if (((strlen($link) == 0)) || ($link[0] != '/')) {
            $link = '/' . $link;
        }
        $link = $baseURL . $link;
    }
    // Skip this link if we've seen it already
    if (isset($seenLinks[$link])) {
        continue;
    }
    // Mark this link as seen
    $seenLinks[$link] = true;
    // Print the link we're visiting
    echo $link.': ';
    flush();
 
    list($linkHeaders, $linkInfo) = load_with_curl($link, 'HEAD');
    // Decide what to do based on the response code
    // 2xx response codes mean the page is OK
    if (($linkInfo['http_code'] >= 200) && ($linkInfo['http_code'] < 300)) {
        $status = 'OK';
    }
    // 3xx response codes mean redirection
    else if (($linkInfo['http_code'] >= 300) && ($linkInfo['http_code'] < 400)) {
        $status = 'MOVED';
        if (preg_match('/^Location:[ \t]*(.*)$/mi',$linkHeaders,$match)) {
                $status .= ': ' . trim($match[1]);
        }
    }
    // Other response codes mean errors
    else {
        $status = "ERROR: {$linkInfo['http_code']}";
    }
    // Print what we know about the link
    echo "$status\n";
}
 
function load_with_curl($url, $method = 'GET') {
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    if ($method == 'GET') {
        curl_setopt($c,CURLOPT_FOLLOWLOCATION, true);
    }
    else if ($method == 'HEAD') {
        curl_setopt($c, CURLOPT_NOBODY, true);
        curl_setopt($c, CURLOPT_HEADER, true);
    }
    $response = curl_exec($c);
    return array($response, curl_getinfo($c));
}
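The decision the loop makes for each link can also be pulled out into a function, which makes it easy to try without touching the network. This is only a sketch: link_status() is a hypothetical name, and matching the Location header case-insensitively is my own choice (header names are not case-sensitive in HTTP).

```php
<?php
// Hypothetical helper that turns an HTTP code plus the raw header string
// into the same OK / MOVED / ERROR labels the loop prints.
function link_status(int $httpCode, string $rawHeaders = ''): string
{
    // 2xx: the page is fine
    if ($httpCode >= 200 && $httpCode < 300) {
        return 'OK';
    }
    // 3xx: redirection, report the new location when the header is present
    if ($httpCode >= 300 && $httpCode < 400) {
        $status = 'MOVED';
        if (preg_match('/^Location:[ \t]*(.*)$/mi', $rawHeaders, $match)) {
            $status .= ': ' . trim($match[1]);
        }
        return $status;
    }
    // Anything else (including 0, which cURL reports when the request failed)
    return "ERROR: $httpCode";
}

// Example with a hand-written header string (hypothetical data)
echo link_status(301, "HTTP/1.1 301 Moved Permanently\r\nLocation: http://example.com/new\r\n\r\n"), PHP_EOL;
```

Inside the loop it would be called with $linkInfo['http_code'] and $linkHeaders.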

To make the request through a proxy with cURL, you can do it like this:

PHP code:

<?php
$url = 'http://www.google.com/';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, true);
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_HTTP);
curl_setopt($ch, CURLOPT_PROXY, 'ip:port');
$page = curl_exec($ch);
 
$a = curl_getinfo($ch);
print_r ($a);
 
curl_close($ch);
 
echo $page;

This code was taken from this thread: http://www.forosdelweb.com/f18/curl-...8/#post3784335

To make the request through a proxy with fopen, you can do it like this:

PHP code:

<?php
$opts = array('http' => array('proxy' => 'tcp://127.0.0.1:8080', 'request_fulluri' => true));
$context = stream_context_create($opts);
$fp = fopen('http://www.example.com', 'r', false, $context);

This code was taken from this thread: http://www.forosdelweb.com/4147617-post93.html

PayPal IPN (Instant Payment Notification) and cURL
To send the notification back to PayPal and verify that the transaction made by the user is genuine, using IPN (Instant Payment Notification):

PHP code:

<?php
// Choose url
if(array_key_exists('test_ipn', $_POST) && 1 === (int) $_POST['test_ipn'])
    $url = 'https://www.sandbox.paypal.com/cgi-bin/webscr';
else
    $url = 'https://www.paypal.com/cgi-bin/webscr';
 
// Set up request to PayPal
$request = curl_init();
curl_setopt_array($request, array
(
    CURLOPT_URL => $url,
    CURLOPT_POST => TRUE,
    CURLOPT_POSTFIELDS => http_build_query(array('cmd' => '_notify-validate') + $_POST),
    CURLOPT_RETURNTRANSFER => TRUE,
    CURLOPT_HEADER => FALSE,
    CURLOPT_SSL_VERIFYPEER => TRUE,
    CURLOPT_CAINFO => 'cacert.pem',
));
 
// Execute request and get response and status code
$response = curl_exec($request);
$status   = curl_getinfo($request, CURLINFO_HTTP_CODE);
 
// Close connection
curl_close($request);
 
 
if($status == 200 && $response == 'VERIFIED')
{
    $str = '';
    foreach($_POST as $k => $v){
        $str .= $k . ' => ' . $v . PHP_EOL;
    }
    file_put_contents('ipn.txt', $str);
}
else
{
    file_put_contents('ipn_error.txt', $response . PHP_EOL . $status);
}

To get the cacert.pem file, go to http://curl.haxx.se/docs/caextract.html and download it, or copy and paste its contents into a file with that name and extension. Place it next to the script that uses this code, or in whatever path you have set in CURLOPT_CAINFO.
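To see what actually gets posted back to PayPal, it helps to build the body by hand once. The sketch below uses a made-up payload (the txn_id and mc_gross values are hypothetical); a real notification arrives in $_POST with many more fields.

```php
<?php
// Hypothetical IPN payload standing in for $_POST
$post = array('txn_id' => 'ABC123', 'mc_gross' => '10.00');

// The array union puts 'cmd' first and keeps the posted fields untouched.
// If the payload already had a 'cmd' key, the left-hand value would win.
$body = http_build_query(array('cmd' => '_notify-validate') + $post);

echo $body, PHP_EOL;
```

With the default PHP settings this prints cmd=_notify-validate&txn_id=ABC123&mc_gross=10.00, which is the string handed to CURLOPT_POSTFIELDS.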

I took this information from the following page: http://www.geekality.net/2011/05/28/...ification-ipn/